Ada-Rapporteur-Group / User-Community-Input

Ada User Community Input Working Group - Github Mirror Prototype

Automatic/Transitive Generic Instantiation #93

Open OneWingedShark opened 3 months ago

OneWingedShark commented 3 months ago

(See AI12-0268-1/01)

Problem: Every Ada generic instantiation needs to be named and declared; when an instantiation is created solely to be passed to another generic instantiation, this requires picking a name for an instantiation that isn't intended to be visible to any other code. Presented is a method to allow an instantiation of a generic to automatically instantiate a formal parameter, assigning it to the name of that parameter in that instantiation.

Proposal: We can have default generic parameters which are automatically instantiated transitively via formal parameters, with minor syntactic changes. Moreover, this proposal allows the uniform handling of all generic parameters (i.e. subprograms and packages).

The formal syntax for the above method would essentially combine the instantiation syntax into the generic formal parameter syntax after USE, thereby altering 12.6 (2/2):

formal_subprogram_declaration ::=
formal_concrete_subprogram_declaration | formal_abstract_subprogram_declaration | formal_default_subprogram_declaration

and adding

formal_default_subprogram_declaration ::=
      use [overriding_indicator] procedure defining_program_unit_name is
         new generic_procedure_name [generic_actual_part] [aspect_specification];
    | use [overriding_indicator] function defining_designator is
         new generic_function_name [generic_actual_part] [aspect_specification];

Because this form is itself a default, including “is null”, “is <>”, or “is default_name” would be meaningless. (Note: the syntax is taken from 12.3 (2/3); it could probably be cleaned up.) The proposal would also alter 12.7 (2/3):

formal_package_declaration ::=
      with package defining_identifier is new generic_package_name
         formal_package_actual_part [aspect_specification];
 |    use package defining_identifier is new generic_package_name
         formal_package_actual_part [aspect_specification];

Examples: Given our current abilities with Ada’s generics and defaults we could say:

generic
    type Some_Type is private;
procedure Generic_Swap( A, B : in out Some_Type );

generic
    type Element is private;
    type Index is (<>);
    type Array_Type is array(Index range <>) of Element;
    with procedure Swap(A, B : in out Element) is <>;
procedure Generic_Sort( Input : in out Array_Type );

However, this requires an instantiation of Generic_Swap and results in something similar to the following:

-- Types.
Type Integer_Vector is Array(Integer range <>) of Integer;

-- Instantiations.
Procedure Swap is new Generic_Swap( Integer );
Procedure Sort is new Generic_Sort(
    Swap       => Swap, -- This line could be deleted.
    Element    => Integer,
    Index      => Integer,
    Array_Type => Integer_Vector
);
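For readers who want to try the current-Ada version, here is a minimal, self-contained sketch of the pieces above. The generic bodies, the simple exchange sort, and the extra "<" formal are assumptions added only so the example compiles and runs; the original declarations do not show how Generic_Sort compares elements.

```ada
with Ada.Text_IO;

procedure Sort_Demo is

   generic
      type Some_Type is private;
   procedure Generic_Swap (A, B : in out Some_Type);

   procedure Generic_Swap (A, B : in out Some_Type) is
      Temp : constant Some_Type := A;
   begin
      A := B;
      B := Temp;
   end Generic_Swap;

   generic
      type Element is private;
      type Index is (<>);
      type Array_Type is array (Index range <>) of Element;
      --  Assumption: a comparison formal, not present in the original spec.
      with function "<" (Left, Right : Element) return Boolean is <>;
      with procedure Swap (A, B : in out Element) is <>;
   procedure Generic_Sort (Input : in out Array_Type);

   procedure Generic_Sort (Input : in out Array_Type) is
   begin
      --  Simple quadratic exchange sort, chosen for brevity only.
      for I in Input'Range loop
         for J in I .. Input'Last loop
            if Input (J) < Input (I) then
               Swap (Input (I), Input (J));
            end if;
         end loop;
      end loop;
   end Generic_Sort;

   type Integer_Vector is array (Integer range <>) of Integer;

   --  This named instance exists only to feed Generic_Sort.
   procedure Swap is new Generic_Swap (Integer);

   --  Swap and "<" are omitted here; the "is <>" defaults pick up the
   --  directly visible Swap and the predefined "<" for Integer.
   procedure Sort is new Generic_Sort
     (Element    => Integer,
      Index      => Integer,
      Array_Type => Integer_Vector);

   Data : Integer_Vector (1 .. 4) := (5, 3, 9, 1);
begin
   Sort (Data);
   if Data /= Integer_Vector'(1, 3, 5, 9) then
      raise Program_Error;
   end if;
   for Item of Data loop
      Ada.Text_IO.Put (Integer'Image (Item));
   end loop;
   Ada.Text_IO.New_Line;
end Sort_Demo;
```

Note that even with the "is <>" default, the instantiation of Generic_Swap still has to be written out and named by hand, which is the step the proposal aims to remove.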

With our current defaulting methods, we could eliminate the Swap parameter in the instantiation of Generic_Sort, but we have enough information in the definition of Generic_Sort that we could eliminate both it and the currently required instantiation of Generic_Swap. So, with proper syntax, we could say something like:

generic
    type Element is private;
    type Index is (<>);
    type Array_Type is array(Index range <>) of Element;
    use procedure Swap is new Generic_Swap(Some_Type => Element);
procedure Generic_Sort( Input : in out Array_Type );

This also works with Packages and given the following —

generic
    type Element(<>) is private;
package Generic_Stack is
-- ...
end Generic_Stack;

generic
    type PostScript_Object(<>) is private;
    type PostScript_Float      is private;
    type PostScript_Context    is private;
    with package Object_Stack  is new Generic_Stack(PostScript_Object);
    with package Float_Stack   is new Generic_Stack(PostScript_Float);
    with package Context_Stack is new Generic_Stack(PostScript_Context);
package Generic_PostScript_VM is
-- ...
end Generic_PostScript_VM;

-- Type Stubs
type PS_Object is tagged null record;
type Real      is new Interfaces.IEEE_Float_64;
type Context   is null record;

we could reduce the required instantiations of —

-- Instantiations.
Package Object_Stack is new Generic_Stack(PS_Object'Class);
Package Real_Stack is new Generic_Stack(Real);
Package Context_Stack is new Generic_Stack(Context);
Package PostScript_VM is new Generic_PostScript_VM(
   PostScript_Object  => PS_Object'Class,
   PostScript_Float   => Real,
   PostScript_Context => Context,
   Object_Stack       => Object_Stack, 
   Float_Stack        => Real_Stack,
   Context_Stack      => Context_Stack
);

to:

-- Instantiations.
Package PostScript_VM is new Generic_PostScript_VM(
   PostScript_Object  => PS_Object'Class,
   PostScript_Float   => Real,
   PostScript_Context => Context
);

simply by replacing "with" by "use" in the formal package declarations of Generic_PostScript_VM.
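To make the boilerplate concrete, the following is a reduced, runnable sketch of the formal-package pattern as it works in Ada today. Generic_VM, Duplicate_Top, the fixed 16-slot store, and the use of a definite Element type are all simplifying assumptions for illustration; they are not taken from the PostScript example.

```ada
procedure Formal_Package_Demo is

   generic
      type Element is private;
   package Generic_Stack is
      procedure Push (Item : Element);
      function Pop return Element;
      function Depth return Natural;
   end Generic_Stack;

   package body Generic_Stack is
      Store : array (1 .. 16) of Element;  --  small fixed capacity (demo only)
      Top   : Natural := 0;

      procedure Push (Item : Element) is
      begin
         Top := Top + 1;
         Store (Top) := Item;
      end Push;

      function Pop return Element is
      begin
         Top := Top - 1;  --  raises Constraint_Error on an empty stack
         return Store (Top + 1);
      end Pop;

      function Depth return Natural is
      begin
         return Top;
      end Depth;
   end Generic_Stack;

   --  Hypothetical, much-reduced stand-in for Generic_PostScript_VM.
   generic
      type Value is private;
      with package Value_Stack is new Generic_Stack (Value);
   package Generic_VM is
      procedure Duplicate_Top;
   end Generic_VM;

   package body Generic_VM is
      procedure Duplicate_Top is
         Item : constant Value := Value_Stack.Pop;
      begin
         Value_Stack.Push (Item);
         Value_Stack.Push (Item);
      end Duplicate_Top;
   end Generic_VM;

   --  Today the stack instance must be declared and named by the client
   --  before it can be passed along.
   package Integer_Stack is new Generic_Stack (Integer);

   package VM is new Generic_VM
     (Value       => Integer,
      Value_Stack => Integer_Stack);
begin
   Integer_Stack.Push (7);
   VM.Duplicate_Top;
   if Integer_Stack.Depth /= 2 then
      raise Program_Error;
   end if;
end Formal_Package_Demo;
```

Under the proposal as described, the declaration of Integer_Stack and the Value_Stack association would presumably both disappear from the client code.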

Fabien-Chouteau commented 3 months ago

Hi @OneWingedShark,

Looking at this example:

generic
    type Element is private;
    type Index is (<>);
    type Array_Type is array(Index range <>) of Element;
    use procedure Swap is new Generic_Swap(Some_Type => Element);
procedure Generic_Sort( Input : in out Array_Type );

What would be the advantage compared to instantiating Generic_Swap in the body of Generic_Sort?

generic
    type Element is private;
    type Index is (<>);
    type Array_Type is array(Index range <>) of Element;
procedure Generic_Sort( Input : in out Array_Type );

procedure Generic_Sort( Input : in out Array_Type ) is
   procedure Swap is new Generic_Swap(Some_Type => Element);
begin

OneWingedShark commented 3 months ago

> What would be the advantage compared to instantiating Generic_Swap in the body of Generic_Sort?

The proposed feature would automatically "chain" instantiation of the formal parameter to the enclosing generic AND give the instantiation a name (the formal parameter's name in the actual instantiation) that could be used externally. The example you give is strictly internal-use and cannot be used 'outside' the instantiation. In other words, this proposal allows the generic system not only to be used to decompose problems into subsystems, but also to let fewer explicit instantiations provide more instances for less effort.
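The internal-versus-external distinction can be illustrated with current Ada. This is only a rough analogue, not the proposed feature: Generic_Tools, Reverse_Pair, and Local_Swap are hypothetical names, and the externally usable Swap here is an ordinary declaration in a package's visible part rather than a defaulted formal parameter.

```ada
procedure Visibility_Demo is

   generic
      type Some_Type is private;
   procedure Generic_Swap (A, B : in out Some_Type);

   procedure Generic_Swap (A, B : in out Some_Type) is
      Temp : constant Some_Type := A;
   begin
      A := B;
      B := Temp;
   end Generic_Swap;

   --  Hypothetical helper package used only for this illustration.
   generic
      type Element is private;
   package Generic_Tools is
      --  Instance in the visible part: clients can name it.
      procedure Swap is new Generic_Swap (Some_Type => Element);

      procedure Reverse_Pair (A, B : in out Element);
   end Generic_Tools;

   package body Generic_Tools is
      procedure Reverse_Pair (A, B : in out Element) is
         --  Instance in a body: strictly internal, as in the quoted example.
         procedure Local_Swap is new Generic_Swap (Some_Type => Element);
      begin
         Local_Swap (A, B);
      end Reverse_Pair;
   end Generic_Tools;

   package Integer_Tools is new Generic_Tools (Integer);

   X : Integer := 1;
   Y : Integer := 2;
begin
   Integer_Tools.Swap (X, Y);          --  legal: Swap is a visible name
   Integer_Tools.Reverse_Pair (X, Y);  --  Local_Swap is used, but only inside
   --  Integer_Tools.Local_Swap (X, Y); would be rejected by the compiler.

   --  Two swaps in a row restore the original values.
   if X /= 1 or else Y /= 2 then
      raise Program_Error;
   end if;
end Visibility_Demo;
```

One instantiation of Generic_Tools yields a usable Integer_Tools.Swap without the client writing a second instantiation, which is the "fewer instantiations, more instances" effect described above.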

This proposal also avoids the combinatorial explosion of other proposals because the core instantiation is still explicit.

sttaft commented 3 months ago

It would be useful to compare this with the proposal that allows the general use of an instantiation wherever a name of a subprogram or package is expected. Admittedly these would be anonymous instantiations, though the most worked-out proposal actually treats the parameter profile as a kind of unique name, so two instantiations with identical actual parameters would refer to the same instantiation. See rfc-structural-generic-instantiation.md

sttaft commented 3 months ago

It would also be interesting to relate this to the proposal allowing expression function syntax as the default for a subprogram: rfc-expression-functions-as-default-for-generic-formal-function-parameters.rst

OneWingedShark commented 3 months ago

> It would be useful to compare this with the proposal that allows the general use of an instantiation wherever a name of a subprogram or package is expected. Admittedly these would be anonymous instantiations, though the most worked-out proposal actually treats the parameter profile as a kind of unique name, so two instantiations with identical actual parameters would refer to the same instantiation. See rfc-structural-generic-instantiation.md

One obvious thing to consider is that this proposal's automatic instantiations aren't anonymous (each is bound to a particular instance), and therefore don't require an implementation that uses non-shared generics (like GNAT) to alter its model to force shared generics, which is exactly what the "deduplication" section in that proposal considers. Instead, this proposal allows something more akin to "default parameters" (albeit "the other way around": the defaulted formal parameter is bound using the other formal parameters of the generic in which it resides). This proposal also avoids essentially all of the discussion of "hoisting" because it ties back into the explicit instantiation: if the generic is instantiated with a non-default actual, then that actual is used; if the parameter is defaulted from that instantiation's formal parameter list, then the resulting construct is scoped to, and dependent on, that instantiation.

It seems to me that the 'structural' proposal imposes a LOT on the compiler-writers for arguably very little benefit: new rules, some of which are non-trivial (e.g. "We don't want to treat X, Y : Some_Generic (Integer, "+").T; differently than X : Some_Generic (Integer, "+").T; Y : Some_Generic (Integer, "+").T; with respect to allowing/forbidding instance sharing").

ARG-Editor commented 3 months ago

> It would be useful to compare this with the proposal that allows the general use of an instantiation wherever a name of a subprogram or package is expected. Admittedly these would be anonymous instantiations, though the most worked-out proposal actually treats the parameter profile as a kind of unique name, so two instantiations with identical actual parameters would refer to the same instantiation. See rfc-structural-generic-instantiation.md

[The following is personal opinion, which should be obvious, but since it is coming from the ARG-Editor account, I mention it anyway - RLB.]

This rejected idea seems to have come back from the dead. I agree with the idea of comparing the proposed approach with other possible approaches, but I'd rather stick to proposals that have some chance of being adopted.

The main problem with the referenced proposal is that it damages a very important principle of Ada -- that the elaboration of entities is always well-defined (and generally at the point of declaration). The idea of sharing instances even if they come from different compilation units (that don't necessarily know anything about each other) is very dangerous, as almost all generic packages do some elaboration. And many generic packages include some state, which means the behavior of a program could change when a new unit is added that has an instance that happens to match one found in some other unit already in the program. This could wreak havoc with unit tests (given that the state of the program would differ between a unit test environment and the environment of the entire system). I'd expect that the static analysis people would not be too happy with the non-determinism.
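A small sketch in current Ada shows why state matters here. Generic_Counter and its operations are hypothetical names invented for this illustration. Under today's rules, two instantiations with identical actuals are distinct instances, each elaborated at its own declaration with its own state.

```ada
procedure Instance_State_Demo is

   --  Hypothetical stateful generic, used only for this illustration.
   generic
      type Item is private;
   package Generic_Counter is
      procedure Note (Value : Item);
      function Count return Natural;
   end Generic_Counter;

   package body Generic_Counter is
      Calls : Natural := 0;  --  per-instance state, initialized at elaboration

      procedure Note (Value : Item) is
      begin
         Calls := Calls + 1;
      end Note;

      function Count return Natural is
      begin
         return Calls;
      end Count;
   end Generic_Counter;

   --  Same generic, same actual, but two separate instances today.
   package Counter_A is new Generic_Counter (Integer);
   package Counter_B is new Generic_Counter (Integer);
begin
   Counter_A.Note (1);
   Counter_A.Note (2);

   --  Counter_B is unaffected by calls made through Counter_A.
   if Counter_A.Count /= 2 or else Counter_B.Count /= 0 then
      raise Program_Error;
   end if;
end Instance_State_Demo;
```

If instances with matching actuals were silently shared, Counter_B.Count would observe the calls made through Counter_A, and adding or removing a unit containing a matching instance could change the result.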

There have been a number of proposals to simplify instantiations that do not suffer from this non-determinism problem; I think we should focus on them.

sttaft commented 3 months ago

> The main problem with the referenced proposal is that it damages a very important principle of Ada -- that the elaboration of entities is always well-defined (and generally at the point of declaration). The idea of sharing instances even if they come from different compilation units (that don't necessarily know anything about each other) is very dangerous, as almost all generic packages do some elaboration. And many generic packages include some state, which means the behavior of a program could change when a new unit is added that has an instance that happens to match one found in some other unit already in the program.

The referenced proposal includes a limitation to preclude mutable state:

Generics annotated with the Allow_Structural_Instantiation aspect are forbidden to have:

  • Mutable global state - TODO refine
  • Non in object formals

I believe Steve Baird has been involved in helping to refine its rules, so they are clearly thinking about the minutiae ... ;-)

ARG-Editor commented 3 months ago

[I hate having to rediscuss this with all of the previous knowledge completely forgotten...]

> The referenced proposal includes a limitation to preclude mutable state:

Right, but that is a significant problem in practice:

(1) It introduces a major maintenance hazard: if state is added to a generic package, suddenly many uses become illegal. It's only safe to use generic procedures and pure generic packages if state is not allowed; any other type of package may have state and there is no claim to the contrary. Related to this, there is no way for a user of a non-pure generic package to know if it has state and can be used by this feature.

(2) It introduces a significant non-portability, since there is no way to know for a particular language-defined generic package whether it includes state (unless it is declared pure). For instance, most of the unbounded containers in Janus/Ada include some memory-management state. This means that the containers can't be used portably in this proposal (and I thought that was the use-case that justified all of these proposals).

(3) One could mitigate (1) and (2) by including some sort of aspect that declares that a generic package does not have state (if Pure isn't appropriate, which it won't be if it depends on anything that is not declared Pure). But then one would have to decide which language-defined packages get that declaration, and that could cause some implementers huge amounts of work (and an associated loss of functionality) - by essentially declaring their existing implementations wrong. I don't think we can justify that for a convenience feature (side-thought: with the likely rise of AI, I don't think we can justify any convenience features at all; if AI is writing the code, it doesn't matter that it is a bit more complex or verbose -- indeed, that is a feature given that the more complex/verbose code is also going to be more predictable and thus less likely for the AI to misinterpret).


I also note that the technique described in the referenced proposal is essentially a subset of the one patented by DEC back in the day. I don't think we should be including a requirement in the Standard that essentially requires an implementation to use a patented technique.

And, finally, I will say that there is no problem implementing generic sharing with the current rules of the language. If that is actually a good idea for an implementation, it should simply do it and keep the Standard out of it.

I think there is room in Ada for some sort of anonymous instantiation capability, but I don't think anything else should be tied to that proposal. Just have anonymous instances on their own in whatever contexts we decide make sense. (Recall that I had proposed anonymous instances solely in object declarations, in the same way that we allow anonymous arrays there -- and only there. See AI12-0215-1 - which I think could be simplified if we don't think getting access to cursors is sufficiently important to provide a mechanism.)

            Randy.