AdaCore / ada-spark-rfcs

Platform to submit RFCs for the Ada & SPARK languages
63 stars 28 forks source link

[RFC] Deferred Execution Proposal #29

Closed Jellix closed 1 year ago

Jellix commented 4 years ago

Language proposal to provide means of final actions to be taken regardless of code path actually executed. It is intended for safer resource management (paired calls) via deferred execution, adding the notion of an "extended exit statement".

Direct link to text: https://github.com/Jellix/ada-spark-rfcs/blob/master/considered/rfc-finalization.rst

yannickmoy commented 4 years ago

@Jellix, you say: "It is unclear, how exactly parameters for deferred statements are supposed to be evaluated. Firstly, of course, they should be evaluated at the time of defining them."

How do you imagine this working? There is no single deferred statement, but a block of code being deferred. I would drop this requirement and evaluate the whole block of code at the deferred point.

That's quite different from "defer" in Go where only a call is deferred. But maybe that's what you'd like to do most of the time in practice? Then we could imagine evaluating arguments at the point of declaration.

yannickmoy commented 4 years ago

About exceptions during evaluation of deferred statements you say: "Abort the whole execution and propagate the exception. That means, not all deferred execution statements are being executed which defeats the whole safety aspect (where part of the promise was that the compiler takes care of the resource management)."

This is how exceptions are handled during exception handling (if a new exception is raised by the handling code), what's the difference here?

yannickmoy commented 4 years ago

Regarding your point: "Presuming that deferred statements are allowed to change variables defined within the enclosing scope (if not, the whole thing will become rather useless), including in [out] parameters, how do we define the precise semantics of such modifications?"

I don't see a problem here. The execution of deferred statement is semantically well-defined, hence their effect as well. It's true that SPARK could help prevent these, but not the compiler, and it does not seem essential to me.

Jellix commented 4 years ago

@Jellix, you say: "It is unclear, how exactly parameters for deferred statements are supposed to be evaluated. Firstly, of course, they should be evaluated at the time of defining them."

How do you imagine this working? There is no single deferred statement, but a block of code being deferred. I would drop this requirement and evaluate the whole block of code at the deferred point.

Indeed, this seems the most straightforward solution.

That's quite different from "defer" in Go where only a call is deferred.

Yes, I think more than 90% of all cases would indeed be covered by the single procedure call (I mean, if all else fails, you can still declare one locally), but this seems a rather artificial restriction, doesn't it? (Except if it's maybe for ease of implementation?) I mean, the deferred code could also be a statement (like when you want to make sure that some variable is nulled out upon return), or the execution depends on some other variable's state, so you'd need a conditional, or if you'd need to call a function, not a procedure, even though you'd ignore the result (calls to OS functions like ReleaseSemaphore come into mind which return a success code), ...

So, in summary, I'd still like to have a "sequence of statements" there instead of restricting the deferred code to a single "procedure call". I mean, despite of the "defer" statement in Go being the inspiration here, we don't need to copy it verbatim, if we can have a more versatile solution.

But maybe that's what you'd like to do most of the time in practice? Then we could imagine evaluating arguments at the point of declaration.

After some consideration, I agree, the evaluation at the point of declaration seems counter-intuitive, and all parameters, if any, should be evaluated at the deferred point.

Jellix commented 4 years ago

About exceptions during evaluation of deferred statements you say: "Abort the whole execution and propagate the exception. That means, not all deferred execution statements are being executed which defeats the whole safety aspect (where part of the promise was that the compiler takes care of the resource management)."

This is how exceptions are handled during exception handling (if a new exception is raised by the handling code), what's the difference here?

Well, the idea was that deferred code is executed regardless of exceptions occurring, so I was struggling with that a bit. You're right that it's not really that much different from exceptions being raised while handling exceptions for one small difference: Inside an exception handler we already know that there was an exception (and we may even know it's identity), while by design a deferred block doesn't know anything about exceptions happening in the enclosing code. I have to think about possible semantic loopholes. But yes, if we allow exception handlers within deferred blocks, the user can actively handle those happening within a deferred block if that is so desired.

raph-amiard commented 4 years ago

Hi @Jellix !

Thanks for your proposal, and for following not yet documented best practices (linking to the full text) :slightly_smiling_face:

I understand the problem this RFC is intending to solve, but I think this is not the optimal way to solve it, and for several reasons:

  1. It doesn't couple object creation and initialization
  2. It doesn't mandate finalization of a resource
  3. It uses imperative statements for initialization/finalization, which is arguably bad practice. I understand that this allows compatibility with existing APIs, but it will encourage people to create APIs that fit this pattern, if the construct is in the language.
  4. It introduces a pretty complex syntactic form, for a purpose that would be better suited by a more restrictive version of controlled objects.

For example, it would be very easy with your proposal to write the following code:

declare
  Input_File : Ada.Text_IO.File_Type;
  Output_File : Ada.Text_IO.File_Type;
begin
  do
    Open (Input_File, "Some_File");
  and then at exit
    Close (Output_File);
  end exit;
  --  Invalid init/finalize pair, no checking

  Do_Processing_Of_Input_File_With_Possible_Exceptions_Being_Raised;
end;

And there is absolutely no information whatsoever for the compiler to detect that you did something wrong, since Open and Close have no special meaning.

The advantage of your proposal is to work with existing APIs, but I think the risk of incorrect code is too high, and also as said before, it will encourage people to write APIs using imperative init/finalize statements, which is bad.

Let's use something like RAII instead

Controlled objects are indeed meant to solve this problem. They're too complex for what we need in most cases, because:

  1. They require tagged types.
  2. The finalize operation can be dispatching.

What we would need instead is the notion of a "static" controlled type, that works on non tagged objects, such as:

package Make_Believe is
   type File (<>) is limited private;
   --        ^ prevents creation without initialization, so user has to go
   --          through call to `Open`

   function Open (Path : String) return File;
private

   type File is record ... end record
     with Finalize_Procedure => Close;

   procedure Close (Self : in out File);
end Make_Believe;

that you would then use like this

declare
    Input_File : File := Open ("Some file");
begin
   --  Do stuff with Input_File
end;

It has a few benefits:

  1. It's much shorter than the form you propose
  2. It requires no additional syntax added to the language
  3. It's by construction very hard to make mistakes
  4. The designer of the API chooses how initialization and finalization are done, not the caller.

The drawback is that it requires the design of new APIs. But I think designing new APIs is better than designing new syntax.

For the remaining use cases where what the users need is general finalization code (like logging), I think we should just add finally.

mosteo commented 4 years ago

I favor the general idea behind the proposal, not caring so much about specifics. In that sense Raphaël's alternative is very tight. Just a side note:

type File (<>) is limited private;
   --        ^ prevents creation without initialization, so user has to go
   --          through call to `Open`

I've been using this approach whenever possible and the hidden nag is that composition is hampered. You cannot have these types as record fields unless you resort to some indirection like Indefinite_Holders (which also entails using the heap where before it wasn't necessary).

With known discriminants you can at least have them in the containing record and use them to constraint the field. If there was a way of doing something equivalent with unknown discriminants...

stcarrez commented 4 years ago

IMHO, a begin, exception, finally statement sequence would be more readable and easier to understand. It also provides a well defined semantics if we consider that finally follows the Ada Controlled type finalization rules.

yannickmoy commented 4 years ago

I find @raph-amiard 's proposal strictly better, at the expense of adding two new features:

While finally blocks alone would have all the drawbacks that @Jellix mentioned, I think the combination is the best of both worlds.

pmderodat commented 4 years ago

stcarrez wrote:

IMHO, a begin, exception, finally statement sequence would be more readable and easier to understand. It also provides a well defined semantics if we consider that finally follows the Ada Controlled type finalization rules.

If we had to choose between finally and Finalize_Procedure (not sure if this is what you meant), I think Raphaël’s propositon is even more readable, because there’s not much to read. :-) Not introducing new syntax and adding as little semantic rules as possible is very appealing, and the RAII idiom helps writing code that is correct by construction: it all happens in the declaration of the variable.

Now, having both (as Raphaël suggests, IIUC) could indeed be useful: use the new aspect when you have several occurences of the same cleanup in your code (factorization) and use finally to avoid the boilerplate when writing single use clean up code (logging is a good example).

mosteo wrote:

I've been using this approach whenever possible and the hidden nag is that composition is hampered. You cannot have these types as record fields unless you resort to some indirection like Indefinite_Holders (which also entails using the heap where before it wasn't necessary).

It indeed hinders composition, but also later reassignment. Fortunately it’s not a downside compared to the original proposal as you can always remove the unknown discriminants: this way you renounce to mandatory initialization but you still benefit from this Finalize_Procedure aspect. I don’t think the original proposal addresses initialization at all.

clairedross commented 4 years ago

FWIW, I like this second proposal much better. I am not confortable with finalization which is attached neither to a block nor to an entity. It feels to error prone.

briot commented 4 years ago

If we had to choose between finally and Finalize_Procedure (not sure if this is what you meant), I think Raphaël’s propositon is even more readable, because there’s not much to read. :-) Not introducing new syntax and adding as little semantic rules as possible is very appealing, and the RAII idiom helps writing code that is correct by construction: it all happens in the declaration of the variable.

That’s actually not Raphael’s proposal as I understand it: it all happens in the “type declaration”, not the “variable declaration”. And I think I would prefer the latter, because it is a bit more flexible.

For sure, I can also see use cases where we systematically want the aspect at the type declaration, so I guess adding it at the variable declaration is an extension, not an alternative proposal.

Manu

yannickmoy commented 4 years ago

@briot all your use cases can also be coded by defining a subtype with the corresponding finalization, and defining a variable of that subtype. Unless the API provided actually enforces the use of finalization on the provided private type, and here it's a choice of the API implementer. So I'd favor the simpler approach of putting finalization on the (sub)type.

clairedross commented 4 years ago

Would such a Finalize aspect be allowed on a constant? Both your examples have the object as an in out parameter.

briot commented 4 years ago

@briot all your use cases can also be coded by defining a subtype with the corresponding finalization, and defining a variable of that subtype. Unless the API provided actually enforces the use of finalization on the provided private type, and here it's a choice of the API implementer. So I'd favor the simpler approach of putting finalization on the (sub)type.

If we can add such aspects to a subtype, this would work indeed (though one could argue that the variable declaration is just using an anonymous subtype, so we might as well have the aspect directly on the variable). In general though a lot of these aspects only apply to types, which is what I had assumed here too.

Only allowing on the subtype still means that my two examples requires users to create those subtypes. Whereas my examples work out of the box with the standard File_Type or the GNAT specific GNAT.Strings.String_Access, without a need to modify the corresponding packages, or have a utils.ads package (or equivalent) in my code.

briot commented 4 years ago

Would such a Finalize aspect be allowed on a constant? Both your examples have the object as an in out parameter.

I would say no, in the sense that I expect the profile of the finalize procedure to require an “in out” parameter, which is the case for both Ada.Text_IO.Close and GNAT.Strings.Free for instance.

So a constant cannot be passed to those subprograms

stcarrez commented 4 years ago

I like Raphaël’s propositon too! I wish I had the finally statement for logging purposes mostly.

If we have the Finalize_Procedure aspect, it looks like inheriting from the Controlled type is not necessary. Which can be great in some cases.

What would be an acceptable semantic for overloading this aspect? Taking Raphaël example and Emmanuel suggestion:

procedure My_Close (F : in out Make_Believe.File)...

F : Make_Believe.File with Finalize => My_Close;

Would the My_Close procedure be called and then the private Close procedure ?

briot commented 4 years ago

procedure My_Close (F : in out Make_Believe.File)...

F : Make_Believe.File with Finalize => My_Close; Would the My_Close procedure be called and then the private Close procedure ?

where:

What if the type is also controlled, do we also need to call Finalize ?

I think this is too much, and I would vote for a simpler rule like:

clairedross commented 4 years ago

And if Finalize is specified on a subtype/derived type, it overrides the Finalize of the parent I guess?

yannickmoy commented 4 years ago

@clairedross yes that would be my understanding, like in a controlled type.

clairedross commented 4 years ago

By the way, what about a component of a type with finalization inside a type with finalization? Should the user finalize the component manually, or is the finalize procedure of the component called implicitly ?

yannickmoy commented 4 years ago

@clairedross good question. TBD. both for the behavior with Finalize aspect, and the interaction with controlled types.

Jellix commented 4 years ago

First of all, I like your idea, Less boiler-plate while achieving almost the same effect. Some comments, though.

  1. It doesn't couple object creation and initialization

Well, it kinda tried to achieve it, but yes, I guess it fell a bit short of the goal. I tried to have a flexible solution, but maybe it turns out to be too flexible.

  1. It uses imperative statements for initialization/finalization, which is arguably bad practice. I understand that this allows compatibility with existing APIs, but it will encourage people to create APIs that fit this pattern, if the construct is in the language.

True. On the other hand, we very often have to cope with external (C-APIs), like bindings to operating system calls etc., so there we would be again writing wrapper code just to achieve automatic finalization.

And there is absolutely no information whatsoever for the compiler to detect that you did something wrong, since Open and Close have no special meaning.

Yes, I am aware of the issue, I even mentioned that in the proposal (at the end, that this is a bit unclear). Compiler checks would be nice, and your idea would indeed achieve that at the expense of being slightly less flexible.

The advantage of your proposal is to work with existing APIs, but I think the risk of incorrect code is too high, and also as said before, it will encourage people to write APIs using imperative init/finalize statements, which is bad.

I don't see why imperative statements are bad per se, it's like saying that "I := I + 1" is bad, if you can write "Add (I, 1)", or even "I.Add (1)" instead. I mean, appropriate finalization calls would still be bound to the object, it's just that they wouldn't be necessarily bound to the semantic of finalization which is something the compiler can't check anyway. But technically, I agree, proper encapsulation is always a good thing.

  1. It's much shorter than the form you propose
  2. It requires no additional syntax added to the language
  3. It's by construction very hard to make mistakes
  4. The designer of the API chooses how initialization and finalization are done, not the caller.

Agreed.

The drawback is that it requires the design of new APIs. But I think designing new APIs is better than designing new syntax.

It requires writing wrapper code for certain APIs, but for the sake of safer code, that's certainly something I could live with.

For the remaining use cases where what the users need is general finalization code (like logging), I think we should just add finally.

I'm not a big fan of finally. And doing both is a bit nasty as we would suddenly have two ways of achieving one thing which is usually something Ada tries to avoid like the plague.

pmderodat commented 4 years ago

Would such a Finalize aspect be allowed on a constant? Both your examples have the object as an in out parameter.

Both are not necessarily opposed. The Finalize procedure for controlled types already takes an in out formal, and yet it is called on constant objects. I guess the ARM has the notion of “constant view” on constants, which finalization bypasses.

What if the type is also controlled, do we also need to call Finalize ?

I think this is too much, and I would vote for a simpler rule like: - Finalize_Procedure aspect cannot be applied to a controlled type - Finalize aspect cannot be applied to a variable which type has Finalize_Procedure or is controlled

And if Finalize is specified on a subtype/derived type, it overrides the Finalize of the parent I guess?

The overriding part sounds interesting. Assuming this aspect can be defined on (derived) types declarations, subtype declarations and objects, what about using the “last one” defined? Object decl-level procedure if present, otherwise subtype declaration-level if present, otherwise type declaration-level.

raph-amiard commented 4 years ago

Hello @Jellix !

I don't see why imperative statements are bad per se, it's like saying that "I := I + 1" is bad, if you can write "Add (I, 1)", or even "I.Add (1)" instead.

They're not bad per-se, sorry if that's what I implied, and indeed the real bonus here is having the init/finalize directly tied to the type, so I agree with you. For initialization, it's just that tying initialization with object creation is easier to do "functionally" I guess.

First of all, I like your idea, Less boiler-plate while achieving almost the same effect.

Cool :slightly_smiling_face: I'll try and write an RFC then. Thanks for getting the ball rolling on this topic. We had internal discussions about this (mostly: controlled objects for all the people that cannot use them) but never got to actually write something!

briot commented 4 years ago

The overriding part sounds interesting. Assuming this aspect can be defined on (derived) types declarations, subtype declarations and objects, what about using the “last one” defined? Object decl-level procedure if present, otherwise subtype declaration-level if present, otherwise type declaration-level.

One of the drawback of allowing various overrides is that then we need to find a way to call the “inherited” finalize.

For instance, from the local (object-level) override of the Finalize aspect, I might want to call the finalize aspect of the corresponding type/subtype. Of course, we can do just like for tagged types and call explicitly the subprogram, but that’s ugly in case the declaration changes… So we end up needing some syntax like:

   procedure My_Close (F : in out File_Type) is
   begin
        Log (“finalizing”);
        File_Type’Finalize (F);
   end My_Close;

   F : File_Type with Finalize => My_Close;

Perhaps let’s keep this as a possible extension, or if we can find a real use-case for it ?

Allowing overrides also means that depending on the view of the variable, we might not know in advance what is going to be called on finalize.

There are even more complex corner cases than Claire was able to find :-) For instance:

 F : File_Type with Finalize => My_Close;
 F2 : File_Type_Access := new File_Type’(F);

 Unchecked_Free (F2);
 —  Should this call My_Close ? I don’t think it can be implemented with zero over-head, so
 —  I think it should not. But might this be surprising for users ? Perhaps a compiler warning
 —  would be needed, but that’s not part of the standard.
pmderodat commented 4 years ago

On the other hand, we very often have to cope with external (C-APIs), like bindings to operating system calls etc., so there we would be again writing wrapper code just to achieve automatic finalization.

I’m not sure I follow. I order to use C APIs, you already need to write wrapping Ada code: at least type and subprogram declarations with Import pragmas/aspects. In this context, adding a Finalize_Procedure to type declarations referencing imported subprograms only once and not having to repeat the call to the finalization procedure everywhere the involved types are used in yor codebase seems like a huge improvements, no?

For the remaining use cases where what the users need is general finalization code (like logging), I think we should just add finally.

I'm not a big fan of finally. And doing both is a bit nasty as we would suddenly have two ways of achieving one thing which is usually something Ada tries to avoid like the plague.

It’s not necessarily redundant. One can see Finalize_Procedure as the canonical way to have finalization for all values of a type, and finally only for ad hoc clean up, that isn’t tied to a specific type. So two constructs for two different use cases. It somewhat reminds me of the duality: functions versus procedures with an OUT parameter. Both are possible and can be considered redundant, but actually one can see both useful for disjoint use cases.

raph-amiard commented 4 years ago

For sure, I can also see use cases where we systematically want the aspect at the type declaration, so I guess adding it at the variable declaration is an extension, not an alternative proposal.

@briot Absolutely, and while I see the use for object-specific finalization, and I like the idea, I still think it's less prioritary than having proper type level finalization, because the case where the API designer chooses what mechanisms to enforce seems more important.

I think I like your idea, and I see how this could bridge the gap, eg. allow people to have a better mechanism for specific cases, or use it for legacy APIs, even if I'm not clear on the behavior in case of copies for example. I think we would still like to keep it as simple and static as possible though, and if possible, keeping it on (sub)types like @yannickmoy suggests sounds good;

@briot all your use cases can also be coded by defining a subtype with the corresponding finalization, and defining a variable of that subtype. Unless the API provided actually enforces the use of finalization on the provided private type, and here it's a choice of the API implementer. So I'd favor the simpler approach of putting finalization on the (sub)type.

If there is no problem making this mechanism static, even on subtypes, then I see no problem with that, but I guess it could be confusing still:

procedure Test is
   type R is record
      X, Y : Integer;
   end record;

   procedure Destruct (Self : in out R) is
   begin
      Put_Line ("In R.destruct");
   end Destruct;

   subtype RR is R with Finalize_Procedure => Foo;

   function Sum (Self : R) return Integer is
   begin
      return Self.X + Self.Y;
      --  Here, I guess Destruct is not called?
   end Sum;

   Inst : RR := (1, 2);
begin
   Put_Line (Sum (Inst)'Image);

   -- Here, Destruct is called
end Test;
raph-amiard commented 4 years ago

@briot for your example

     F : File_Type with Finalize => My_Close;
     F2 : File_Type_Access := new File_Type’(F);

     Unchecked_Free (F2);
     --  Should this call My_Close ? I don’t think it can be implemented with zero over-head, so
     --  I think it should not. But might this be surprising for users ? Perhaps a compiler warning
     --  would be needed, but that’s not part of the standard.

This is very similar to my example above with subtypes. Basically when you copy F in the creation of F2, you lose the object/subtype specific finalize.

One of the drawback of allowing various overrides is that then we need to find a way to call the “inherited” finalize.

For instance, from the local (object-level) override of the Finalize aspect, I might want to call the finalize aspect of the corresponding type/subtype. Of course, we can do just like for tagged types and call explicitly the subprogram, but that’s ugly in case the declaration changes… So we end up needing some syntax like:

For me:

  1. You wouldn't, you assume that the type level finalize is always called, whatever you do on the object. You just need to define the order, and since there can be only two overrides, it seems pretty simple.
  2. For me this is one reason the whole idea of object/subtype specific finalize should be left on the side for the moment, because it's quite more complex, and creates dozens of corner cases, while not adding much value IMHO, since you can always create a wrapper type anywhere in Ada if you need specific finalization logic.
raph-amiard commented 4 years ago

It indeed hinders composition, but also later reassignment. Fortunately it’s not a downside compared to the original proposal as you can always remove the unknown discriminants: this way you renounce to mandatory initialization but you still benefit from this Finalize_Procedure aspect. I don’t think the original proposal addresses initialization at all.

@pmderodat you're right, I've been using existing regular Ada for the initialization part. We might want to investigate that area, especially if we want to make this approach work in a not too surprising fashion on non limited types though!

Jellix commented 4 years ago

On the other hand, we very often have to cope with external (C-APIs), like bindings to operating system calls etc., so there we would be again writing wrapper code just to achieve automatic finalization.

I’m not sure I follow. I order to use C APIs, you already need to write wrapping Ada code: at least type and subprogram declarations with Import pragmas/aspects.

I was talking about vendor supplied APIs (Integrity API springs to mind) where they basically just provide a thin binding to the original C-API in Ada syntax. Also, thin bindings can be generated automatically these days, same restrictions apply.

I'm not a big fan of finally. And doing both is a bit nasty as we would suddenly have two ways of achieving one thing which is usually something Ada tries to avoid like the plague.

It’s not necessarily redundant. One can see Finalize_Procedure as the canonical way to have finalization for all values of a type, and finally only for ad hoc clean up, that isn’t tied to a specific type.

True, I didn't think it through. Even if you could bind the Finalize_Procedure aspect to not only a subtype, but also bind it to an object, like @briot suggested (and I'm in favor of that), there may actually be a use case for both. As I understand the suggestions so far, the Finalize_Procedure to be called would have a signature of procedure (T : in out Type_To_Be_Finalized);, no other parameters being allowed, so - for a good reason - it restricts what you can do with it. A finally would be less restricted.

raph-amiard commented 3 years ago

@yannickmoy @Roldak It seems this RFC was overlooked in your finalization WG.

In particular, it looks like you prefered to go with an approach similar to what I outlined in https://github.com/AdaCore/ada-spark-rfcs/pull/29#issuecomment-539025062.

If that's the case, I'll let you close this RFC.

raph-amiard commented 1 year ago

@Jellix We're getting ready to implement finally, so we reconsidered this, but this is much more complicated to implement.

In particular, in the context of embedded programming, defer-like capabilities will sometimes silently need unbounded space to keep track of the actions that need to be ran at subprogram-exit.

In general, it's much more effort to implement