Ada-Rapporteur-Group / User-Community-Input

Ada User Community Input Working Group - Github Mirror Prototype

size of immutably limited unconstrained discriminated components #111

Open swbaird opened 3 hours ago

swbaird commented 3 hours ago

A GNAT user complained that Storage_Error is raised when elaborating the declaration of an object of an undiscriminated record type that has a component of the Queue type declared in an instance of Ada.Containers.Bounded_Synchronized_Queues. The component's subtype is unconstrained.

type R is limited record  -- must be limited: Queue is a protected (limited) type
  Q : My_BSQ_Instance.Queue;
end record;
Obj : R; -- raises Storage_Error
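
For concreteness, here is a minimal sketch of the kind of instantiation presumably involved; the element type, capacity value, and unit names are illustrative assumptions, not taken from the report:

with Ada.Containers.Synchronized_Queue_Interfaces;
with Ada.Containers.Bounded_Synchronized_Queues;

package Queue_Demo is

   package Int_Interfaces is
     new Ada.Containers.Synchronized_Queue_Interfaces (Element_Type => Integer);

   package My_BSQ_Instance is
     new Ada.Containers.Bounded_Synchronized_Queues
       (Queue_Interfaces => Int_Interfaces,
        Default_Capacity => 100);

   type R is limited record
      Q : My_BSQ_Instance.Queue;  -- component subtype is unconstrained
   end record;

   Obj : R;  -- an allocate-the-max implementation sizes Q for Count_Type'Last
             -- elements, not for Default_Capacity, hence Storage_Error

end Queue_Demo;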

This does not appear to be a GNAT-specific problem.

It seems that this behavior is effectively a consequence of the RM's implementation advice. A.18.29 says "Bounded queue objects should be implemented without implicit pointers or dynamic allocation", and A.18.1 says Count_Type'Last should be at least 2**31 - 1.
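
To put a number on that: assuming, purely for illustration, a 32-bit element type, an allocate-the-max representation of an unconstrained Queue needs room for Count_Type'Last elements, i.e. roughly (2**31 - 1) * 4 bytes, about 8 GiB, for the element buffer alone, regardless of Default_Capacity.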

This behavior clearly seems like a problem. Before discussing possible solutions, we will discuss the problem in more detail.

It is well accepted that a "reasonable" Ada implementation might raise Storage_Error when elaborating the following object declaration:

type Too_Big (N : Natural := 0) is record
  S : String (1 .. N);
end record;

type R is record
  TB : Too_Big;
end record;

Obj : R; -- raises Storage_Error with most Ada implementations

This notion of a "reasonable" Ada implementation is not an RM-defined concept. RM 11.1(6) allows the elaboration of a null statement to raise Storage_Error. But it is well established that Ada implementations are allowed (although not required) to choose a contiguous representation for objects so that all the storage that an object will ever require is allocated when the object is declared. That is true in general, but is specifically confirmed for the Queue type by the implementation advice cited above.

But one might argue that this Queue example is different from the Too_Big example in an important respect: the Queue type is immutably limited. That means that the discriminants of a Queue object will never change after the object is initialized. In the case of a stand-alone object (as opposed to a component), this means that an implementation need only allocate enough storage to accommodate the initial discriminant values of the object - there is no possibility that the object will "grow" later. [Allocated objects are like stand-alone objects for purposes of this discussion.]

But the situation with components is different.

Consider the case where the default expression value for the discriminant in question is static. More complicated scenarios are possible, but we don't need that added complexity here.

Suppose we were to treat the default initial value as if it were a component subtype constraint (in the immutably limited case). That approach works in the case of a top-level object, but not for a component.

Assume the compiler treats (at least with respect to storage requirements)

type T (N : Natural := 10) is limited record
  Chars : String (1 .. N);
end record;

type R is limited record
  T_Component : T;
end record;

as if the component subtype for T_Component were not "T" but instead "T (10)".

That doesn't allocate sufficient storage if we ever encounter a value of type R where T_Component.N is larger than 10. For example

Obj : R := (T_Component => (26, "ABCDEFGHIJKLMNOPQRSTUVWXYZ"));

One could imagine (and this came up in earlier discussions of this problem) inventing new implicitly-defined-by-the-compiler discriminants for a composite type that has such a component type, but that doesn't work well if the enclosing type (or an enclosing type of the enclosing type, etc.) is an array type.

One solution which only involves changes to the Bounded_Synchronized_Queues package (as opposed to more general "language" changes) would be to add a new generic formal

Max_Capacity : Count_Type := Default_Capacity;

and then use it to declare a subtype

subtype Bounded_Capacity is Count_Type range Count_Type'First .. Max_Capacity;

which is then used as the subtype of the Capacity discriminant:

protected type Queue (Capacity : Bounded_Capacity := Default_Capacity; ...) is ...;
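
Assembled against the A.18.29 specification, the proposal amounts to roughly the following; this is a sketch of the proposed revision, not current language text, and only the declarations relevant to the change are shown:

generic
   with package Queue_Interfaces is
     new Ada.Containers.Synchronized_Queue_Interfaces (<>);
   Default_Capacity : Count_Type;
   Max_Capacity     : Count_Type := Default_Capacity;  -- proposed new formal
   Default_Ceiling  : System.Any_Priority := System.Priority'Last;
package Ada.Containers.Bounded_Synchronized_Queues is

   subtype Bounded_Capacity is
     Count_Type range Count_Type'First .. Max_Capacity;

   protected type Queue
     (Capacity : Bounded_Capacity := Default_Capacity;
      Ceiling  : System.Any_Priority := Default_Ceiling)
     with Priority => Ceiling
   is new Queue_Interfaces.Queue with

      overriding
      entry Enqueue (New_Item : in Queue_Interfaces.Element_Type);
      overriding
      entry Dequeue (Element : out Queue_Interfaces.Element_Type);

      not overriding
      function Current_Use return Count_Type;
      not overriding
      function Peak_Use return Count_Type;

   private
      -- not specified by the language
   end Queue;

end Ada.Containers.Bounded_Synchronized_Queues;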

This will reduce Queue'Size substantially for most Ada compilers and most instances of the generic. And if users run into similar problems with their own user-defined types then they can make similar discriminant-subtype adjustments. But it will cause Constraint_Error to be raised if a Capacity value larger than Max_Capacity is ever specified.
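
The same kind of discriminant-subtype adjustment, applied to the user-defined type from the earlier example (a sketch; the bound of 10 simply reuses the default value as a maximum):

subtype Small_Length is Natural range 0 .. 10;

type T (N : Small_Length := 10) is limited record
   Chars : String (1 .. N);
end record;
-- T'Size can now be based on 10 characters rather than on Natural'Last,
-- at the cost of Constraint_Error for any N above 10.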

Another approach that has been suggested would be to delete the implementation advice and expect implementations to use a non-contiguous representation of some sort. That seems inconsistent with the reason for having a bounded version of the Synchronized_Queues package in the first place.

For a more general solution, the language could define a Boolean aspect specifiable for any immutably limited discriminated type having default discriminant values. The aspect might indicate that use of a noncontiguous representation (with associated storage management as needed when the size of an object changes) is expected. Or it might indicate that all component objects (as opposed to stand-alone or allocated objects) of the given type are implicitly constrained by the default discriminant values (this approach might require that the default discriminant values be static). If some such new aspect were to be defined, then presumably it would be specified for the Queue type in the Bounded_Synchronized_Queues package.
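
Purely as an illustration of the second variant, such an aspect specification might look like the following; the aspect name is invented for this sketch and is not part of the language:

type T (N : Natural := 10) is limited record
   Chars : String (1 .. N);
end record
  with Components_Constrained_By_Defaults;
-- hypothetical aspect: component objects of T would act as if constrained by T (10)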

In any case, the status quo is not good.

ARG-Editor commented 2 hours ago

I think the original intent was that for most uses, the Capacity would be specified when an object (including components) was declared. The defaults here are a convenience feature, not really critical to the usage of the feature. If they don't work, no big deal.

Note that in this case, we're talking about a protected object, for which no aggregate or explicit initialization is possible. In particular, you can't write something like

Obj : R := (Q => (26, 10));

because the aggregate is illegal; the only legal aggregate is

Obj : R := (Q => <>);

which cannot change the discriminant from the default. One cannot use an object name to get other discriminants either, and there is no legal aggregate for a protected or task object. You can of course write an allocator, but that necessarily involves a level of indirection.

Thus, I conclude for the actual problem case, there is no real problem. The compiler can treat the default discriminant as a constraint, since there is no way to change it. (The original GNAT implementation was OK for this type, there was no need to change it in this case.) GNAT is not "wrong" in this case, but it is being extremely unfriendly.

And then I would say that the more general case is the usual "question not asked", so let's not answer it.

For Janus/Ada, objects may be allocated with levels of indirection, as this is much more usable for the majority of users and uses. I have a plan to implement an aspect "Contiguous" to force an allocate-the-max implementation for safety-critical and other performance-critical uses. I wouldn't expect it to be used often.

The Ada Standard has been very carefully silent on the "allocate-the-max" strategy. I think it was always a mistake to allow that to be the sole strategy for implementation -- we don't allow limiting the range of a case statement in the ACATS, why do we allow limiting the range of a discriminant?? I would be very opposed to enshrining that mistake in the RM.

                 Randy.
Richard-Wai commented 2 hours ago

I'm really failing to see the issue here. The real problem is that the user is trying to allocate something on the stack that's too big, which is not the language's fault. How much room is available on the stack shouldn't be up to the language, and my reading of the implementation advice is that it accommodates that reality. I fail to see how this is a limitation of the language rather than a limitation of the target.

Generally I have found that any attempt to allocate giant objects on the stack is a red flag that the approach to the problem might not be great. In this case it seems obvious that you can either declare your Queue somewhere that the compiler assigns to .data, such as in a package, or force your bounded queue onto the heap with an allocator; at least you still get bounded memory use that way. I think it is better to be explicit about that than to let the compiler do the same thing implicitly.
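
A sketch of the two workarounds just described, reusing the hypothetical names from the Queue_Demo sketch earlier in the thread:

with Queue_Demo;

package Workarounds is

   --  1. A library-level declaration keeps the object off any task's stack;
   --     Queue_Demo.Obj above is already of this form (whether the resulting
   --     static allocation succeeds is still up to the target).

   --  2. An explicit allocator puts the queue on the heap; storage use is
   --     still bounded, at the cost of one level of indirection.
   type Queue_Access is access Queue_Demo.My_BSQ_Instance.Queue;
   Heap_Q : constant Queue_Access := new Queue_Demo.My_BSQ_Instance.Queue;

end Workarounds;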

It seems weird that we'd try to bend the language to circumvent a very reasonable implementation permission in order to deal with a limited stack. I also don't understand what the size of Count_Type has to do with anything; I don't see why the compiler should promise to allocate a Bounded_Queue with a Capacity of Count_Type'Last on the stack.

This user report sounds like an attempt to do something that is simply not possible on the target. And for something so unusual (allocating giant bounded queues on a stack), I find it a bit unreasonable to make the language bend over backwards to allow some kind of hidden heap allocation anyway, unless I'm misunderstanding the proposal?

swbaird commented 1 hour ago

Richard: The presence of default discriminant values is usually an indication that declaration of an unconstrained variable is intended to be ok. If we decide that this type really is potentially enormous, then shouldn't we remove those default discriminant values? If the uses described in this discussion are legal but result in raising Storage_Error for any type declared anywhere in the RM, that seems to me like we are setting a trap for unwary users to fall into. I think we want to avoid that.

Randy: I think you are mistaken about "no ... explicit initialization is possible" because of the possibility of a function call. Consider a function whose result type is the protected type, and which returns a result having discriminant values different than the defaults. So we could have an aggregate of our enclosing type whose protected component value is a function call. Regarding the "question not asked" point: the GNAT compiler changed its handling of this case as a result of a customer complaint that a case like this was incorrectly raising Constraint_Error.
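
For concreteness, the kind of function swbaird seems to have in mind might look like the following, again using the hypothetical Queue_Demo names; whether this really provides the claimed escape hatch is exactly what the next comment questions:

with System;
with Queue_Demo;

function Make_Wider_Queue return Queue_Demo.My_BSQ_Instance.Queue is
begin
   --  Extended return with a constrained subtype indication: the result is
   --  built in place with a non-default Capacity.
   return Result : Queue_Demo.My_BSQ_Instance.Queue
     (Capacity => 200, Ceiling => System.Priority'Last);
end Make_Wider_Queue;

--  Such a function could then supply a component value in an aggregate of the
--  enclosing type:
--    Wide_Obj : Queue_Demo.R := (Q => Make_Wider_Queue);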

Richard-Wai commented 1 hour ago

Richard: The presence of default discriminant values is usually an indication that declaration of an unconstrained variable is intended to be ok. If we decide that this type really is potentially enormous, then shouldn't we remove those default discriminant values? If the uses described in this discussion are legal but result in raising Storage_Error for any type declared anywhere in the RM, that seems to me like we are setting a trap for unwary users to fall into. I think we want to avoid that.

To me it seems pretty intuitive that the "default discriminant" in this case is just there to allow you to specify it with the instantiation of the generic, but I will agree that it ends up being awkward. It's certainly easy enough to just set up a constrained subtype... But that being said, I still think this is ultimately a combination of the compiler being less smart than it could be (I don't see why it can't figure out the actual size from the declaration of the object, even at run time) and limitations of the target.

Could we possibly allow these sorts of giant objects to be allocated on a secondary stack instead? That seems like a reasonable middle ground given what we already do with unconstrained return types.

ARG-Editor commented 56 minutes ago

Randy: I think you are mistaken about "no ... explicit initialization is possible" because of the possibility of a function call. Consider a function whose result type is the protected type, and which returns a result having discriminant values different than the defaults.

But how do you do that? You can't return an existing object, and we've already ruled out writing an aggregate. There are lots of cases in Ada where it looks like a function would be an issue, but no such function can actually be written.

Re: "Question not asked" - you can't ask a bunch of different questions at the same time. If you want to talk about some other case that happened years ago, let's do that separately and with all of the details of that case. Not with some handwaving about someone's memory of some problem.