AdaCore / ada-spark-rfcs

Platform to submit RFCs for the Ada & SPARK languages

[RFC] declare local variables without a declare block #85

Closed sttaft closed 1 year ago

sttaft commented 2 years ago

Here is the first draft of an RFC proposing to allow local declarations as part of any "arm" of a compound statement, without introducing a declare block. Full text here

glacambre commented 2 years ago

Rendered link: https://github.com/sttaft/ada-spark-rfcs/blob/topic/rfc-local-vars-without-block/rfc-local-vars-without-block.md

I agree a lot with the goal, but I'm not sure I agree with the current RFC. In my opinion the added begin makes things very hard to read. Moreover it doesn't solve for the cases where you need a temporary variable, e.g. in the middle of a function - you would still need a declare block for that.
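For reference, a minimal sketch of what standard Ada requires today when a temporary is needed in the middle of a subprogram body. The procedure name and values are made up for illustration; only the declare block itself is the point.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Declare_Block_Demo is
   X : Integer := 7;
begin
   Put_Line ("Before:" & Integer'Image (X));

   --  Standard Ada: a temporary introduced mid-body needs a declare block,
   --  which costs three extra lines and one extra level of indentation.
   declare
      Squared : constant Integer := X ** 2;
   begin
      X := X + Squared;
   end;

   Put_Line ("After:" & Integer'Image (X));
end Declare_Block_Demo;
```

The RFC and the let alternative discussed in this thread both aim to remove the declare/begin/end wrapper around such a declaration.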

I think a better approach would be to introduce a let keyword (e.g. let X : Integer := 0;) which would allow declaring local variables without a begin and anywhere they are needed.

This could even be an opportunity to introduce better defaults (e.g. let declarations would be constant and not null by default, and mutable and nullable keywords could be introduced to change this behavior).

Introducing new keywords is of course a backward-compatibility issue, but I feel that now that libadalang exists, creating tools that would allow migrating from an Ada version to another would only take a reasonable amount of effort.

raph-amiard commented 2 years ago

FWIW I agree with Ghjuvan that this would be a better approach, and (modulo the new keywords, that could be implemented as reserved words so we don't have backwards compatibility issues) it also seems pretty simple to implement.

Fabien-Chouteau commented 2 years ago

I like @glacambre's let proposal better than using begin. But I don't agree with it being constant by default or with the introduction of nullable or mutable, because I don't think it's good to have it differ from the rest of the declarations.

There are already ways to have mutable/non-mutable and nullable/non-nullable in Ada.

let A : not null Some_Pointer_Type := new Something;
let B : constant Integer := 42;
let C : Natural := 42;

clairedross commented 2 years ago

Don't we have more chances of inclusion in the standard with the first version (with the begin)? It seems less earth shattering ;)

-- Claire Dross Senior Software Engineer, AdaCore

raph-amiard commented 2 years ago

Don't we have more chances of inclusion in the standard with the first version (with the begin)? It seems less earth shattering ;)

Well it's about balance clearly. I think this is important enough to warrant new syntax (which wouldn't be the first time this is done in the history of Ada standardization).

glacambre commented 2 years ago

@Fabien-Chouteau I agree that having the default be different between declare-block declarations and let-declarations would be unfortunate, but I think it would be worth it for two reasons:

1) Such a discrepancy is not that big of a deal in practice (e.g. JavaScript's var declarations differ a lot from the let ones, but despite the ugliness this difference is considered a huge improvement by JavaScript developers).
2) At the moment, if something isn't not null, it might be either because the declaration holds the value null at some point in its lifetime or because the developer forgot to add not null to the declaration. Defaulting to not null and using nullable would force the developer to signal that the declaration is in fact allowed to be null.

However, @clairedross has a point: let declarations would already be a huge change and might be hard to get standardized, and adding a mutable and a constant reserved word on top might make things harder. On the other hand, this new safer-by-default behavior might actually work in favor of let declarations...

clairedross commented 2 years ago

I agree with Ghjuvan. I don't see this being standardized at any point unfortunately. It is too far from the current syntax and way of thinking too. Already the "let" syntax seems like a stretch to me...

mhatzl commented 2 years ago

I think allowing non-constants with let is very confusing to read.

What about decl instead of let? Like:

decl A : not null Some_Pointer_Type := new Something;
decl B : constant Integer := 42;
decl C : Natural := 42;

This RFC is kind of the short version of declare, hence the shortened name (I could not find out whether decl is already a reserved word).

I am not a big fan of the begin approach, as it makes if-statements quite unreadable for me when there are elsif or else branches.

sttaft commented 2 years ago

An advantage of the "begin" approach is that it naturally generalizes to allowing exception handlers as well.

If readability is a concern, rather than a full outdenting for "begin", a partial outdent could be used:

 if X > 5 then
     Squared : constant Integer := X**2;
   begin
     X := X + Squared;
 else
     Cubed : constant Integer := X**3;
   begin
     X := X + Cubed;
 end if;

mhatzl commented 2 years ago

An advantage of the "begin" approach is that it naturally generalizes to allowing exception handlers as well.

OK, I can see that as a slight advantage.

If readability is a concern, rather than a full outdenting for "begin", a partial outdent could be used:

 if X > 5 then
     Squared : constant Integer := X**2;
   begin
     X := X + Squared;
 else
     Cubed : constant Integer := X**3;
   begin
     X := X + Cubed;
 end if;

Wouldn't this bring back the disadvantage of having too much indentation? Or would you make it optional, so people can choose whether begin aligns with if or is partially outdented?

sttaft commented 2 years ago

Note that declarations and statements are syntactically distinct in Ada, so we could simply allow arbitrary interspersing of declarations and statements, which seems to be the desire expressed by wanting to introduce a local variable in the middle of a sequence of statements. This clearly has downsides, and is essentially incompatible with Ada's approach to exception handlers, which needs a clear separator between the declarations that are visible in the handler, and the statements (or declarations) that are covered by the handler, which is what "begin" provides. A solution to this would be to make the "begin" mandatory if there is an exception handler, but otherwise be optional.

Note that "begin" also is a point at which task activation occurs, but unfortunately whether or not an object contains a task might be hidden in the case of a limited private type. Probably the simplest rule would be that any task objects not yet activated by the time a statement is encountered would be activated then.
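To illustrate the separator role of "begin" described above, here is a small sketch in standard Ada using an ordinary declare block. The names are hypothetical. Declarations before "begin" are visible inside the handler, while only the statements after "begin" are covered by it; an exception raised while elaborating the declarations themselves would propagate to the enclosing scope instead.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Handler_Scope_Demo is
   X : Integer := 0;
begin
   declare
      --  Declared before "begin": visible in the handler below.
      Divisor : constant Integer := X;
   begin
      --  After "begin": this statement is covered by the handler.
      X := 100 / Divisor;
   exception
      when Constraint_Error =>
         Put_Line ("Division failed, Divisor =" & Integer'Image (Divisor));
   end;
end Handler_Scope_Demo;
```

If declarations and statements were freely interspersed with no separator, it would be unclear which declarations the handler may reference, which is the difficulty raised in this comment.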

sttaft commented 2 years ago

As far as indenting, in Ada that is always up to the programmer and/or the project/company coding conventions.

mhatzl commented 2 years ago

Note that declarations and statements are syntactically distinct in Ada, so we could simply allow arbitrary interspersing of declarations and statements, which seems to be the desire expressed by wanting to introduce a local variable in the middle of a sequence of statements. This clearly has downsides, and is essentially incompatible with Ada's approach to exception handlers, which needs a clear separator between the declarations that are visible in the handler, and the statements (or declarations) that are covered by the handler, which is what "begin" provides. A solution to this would be to make the "begin" mandatory if there is an exception handler, but otherwise be optional.

If I understood you correctly, the following code would then be valid?

 if X > 5 then
     Squared : constant Integer := X**2;
     X := X + Squared;
 else
     Cubed : constant Integer := X**3;
     X := X + Cubed;
 end if;

And only for exceptions, begin would be needed?

sttaft commented 2 years ago

Yes, making "begin" optional is possible. The working group that had discussed this felt that without the "begin" it was harder to read, but I suppose so long as the "begin" can be used if desired, you could accommodate different approaches based on project coding conventions.

I will admit I don't like having too many options, as they do mean that you end up having to codify a lot of rules in coding conventions, and you end up with effectively multiple dialects. The fact that "in" is optional in Ada, in hindsight, seems like the wrong choice, since now you have a dialect where "in" is required, and one where it is disallowed, and perhaps one where it is required for procedures but disallowed for functions. This variety is not really in the best interest of having a cohesive community of programmers for the language.
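As a concrete reminder of the optional "in" mentioned above, the following sketch shows two profiles that mean exactly the same thing, since "in" is the default parameter mode. The procedure names are made up for illustration.

```ada
procedure Mode_Demo is
   --  These two profiles are equivalent: "in" is the default mode,
   --  so writing it or omitting it is purely a style decision.
   procedure Show_A (Value : Integer) is null;
   procedure Show_B (Value : in Integer) is null;
begin
   Show_A (1);
   Show_B (1);
end Mode_Demo;
```

Which of the two spellings a project uses is left to its coding conventions, which is the kind of dialect split described here.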

clairedross commented 2 years ago

If I understood you correctly, the following code would then be valid?

 if X > 5 then
     Squared : constant Integer := X**2;
     X := X + Squared;
 else
     Cubed : constant Integer := X**3;
     X := X + Cubed;
 end if;

And what if there are additional statements between the else and the declaration of Cubed, would it still be allowed?

Glacia commented 2 years ago

If the goal is to introduce an ability to declare variables inside if statements then why not use this opportunity to allow declaration of variables before if scope?

with Squared, Cubed : Integer  -- Squared and Cubed scope is within if and end if
if X > 5 then
    Squared := X**2;
    X := X + Squared;
else
    Cubed := X**3;
    X := X + Cubed;
end if;

This way you can call a function, save the result in a variable, test it with an if, do something with that variable, and discard it. C++17 introduced a similar feature. The syntax is hypothetical, obviously.

mhatzl commented 2 years ago

I will admit I don't like having too many options, as they do mean that you end up having to codify a lot of rules in coding conventions, and you end up with effectively multiple dialects. The fact that "in" is optional in Ada, in hindsight, seems like the wrong choice, since now you have a dialect where "in" is required, and one where it is disallowed, and perhaps one where it is required for procedures but disallowed for functions. This variety is not really in the best interest of having a cohesive community of programmers for the language.

I agree.

Maybe as a compromise, decl must be used if no begin is used, but exception handling is only possible with begin? Would this remove the possible conflict over whether to force the use of begin?

onox commented 2 years ago

This could even be an opportunity to introduce better defaults (e.g. let declarations would be constant and not null by default, and mutable and nullable keywords could be introduced to change this behavior).

I agree that assuming a declaration to be constant and not null would be a better default, but because the behavior is the opposite of regular declarations (between is and begin, or in records), this could be confusing to the reader and increase the cognitive load if variables are mutable in case A but not mutable in case B.

IMO the whole language (all declarations and records) should switch to the new default then. This is of course wildly backwards incompatible with previously written code, so perhaps this should be turned on/off with a pragma per package or project. A code mod could be written to insert the mutable and remove the constant keyword.

The fact that "in" is optional in Ada, in hindsight, seems like the wrong choice, since now you have a dialect where "in" is required, and one where it is disallowed, and perhaps one where it is required for procedures but disallowed for functions. This variety is not really in the best interest of having a cohesive community of programmers for the language.

Another controversial opinion: could just in be added to Annex J? :stuck_out_tongue:

clairedross commented 2 years ago

IMO the whole language (all declarations and records) should switch to the new default then. This is of course wildly backwards incompatible with previously written code, so perhaps this should be turned on/off with a pragma per package or project. A code mod could be written to insert the mutable and remove the constant keyword.

I think a pragma changing the default like that would make Ada notably harder to read unfortunately...

sttaft commented 2 years ago

As a general comment -- it might be worth taking this one step at a time... ;-)

sttaft commented 2 years ago

One possibility that came to mind today was that we could use something other than "begin" as the separator, in particular "in" which is used in many functional languages as part of a "let" construct, such as:

  let X = 42 in X * X

This would also eliminate the somewhat annoying ambiguity with using "begin" where if you do not have any declarations preceding it, it would be syntactically ambiguous with the start of a nested declare block. Also, "in" is lighter weight and might work better when only partially outdented. So as an example:

if X > 5 then
    Squared : constant Integer := X**2;
  in
    X := X + Squared;
else
    Cubed : constant Integer := X**3;
  in
    X := X + Cubed;
end if;

Choosing "in" has other advantages, because it is already a reserved word in Ada, and because there is no statement or declaration that can legally start with "in" at the moment.

glacambre commented 2 years ago

I like in better than begin, but I still think it's not the optimal approach to solving the issue of declare blocks being too syntactically heavy. If in is still tied to an if or loop statement, it doesn't solve the problem of needing to declare a new variable in the middle of a function.

mgrojo commented 2 years ago

Current Ada has the form:

declare-or-package-X-is-or-subprogram-X-is
  declaration-list;
begin
   statements;
end block-or-subprogram-or-package;

Why not generalize this pattern for any control structure? They have in common a final end, so the new rule is declare comes before any control structure or a block. The scope of the variables is until the corresponding end.

declare
    Squared : constant Integer := X**2;
    Cubed : constant Integer := X**3;
if X > 5 then
    X := X + Squared;
else
    X := X + Cubed;
end if;

That doesn't allow declaring variables in each of the branches, but for those corner cases you can still use a block (it's even better to still have a use for blocks, I think).

This can be expanded for exceptions as every end can have an exception handler, and it covers from the corresponding if, case, etc.

declare
    Squared : constant Integer := X**2;
    Cubed : constant Integer := X**3;
if X > 5 then
    X := X + Squared;
else
    X := X + Cubed;
exception
    when others =>
        raise My_Error;
end if;

I'm not convinced, though. This might be counterintuitive using the classical indentation for else, but maybe exceptions should be left for blocks after all.

If declaring variables in each branch is a must, I'd prefer to simply follow what other languages are doing and allow declarations as statements. Everything else is, in my honest opinion, complicating the syntax.

if X > 5 then
    Squared : constant Integer := X**2;
    X := X + Squared;
else
    Cubed : constant Integer := X**3;
    X := X + Cubed;
end if;

kevlar700 commented 2 years ago

I'm not 100% clear on the scope of "arm". I admit that I was averse to declare blocks at first, despite understanding their benefits. Now I think that I really appreciate them.

It appears to me that, whilst this proposal does not lose the inherent protection that declare provides by making the scope of shadowing variables clear, it does seem to degrade it by losing the indentation.

ethindp commented 2 years ago

I have to disagree with this proposal. The idea of declaring variables "only when you need them" is something all other languages have, and I feel that the way Ada does things right now has several benefits that other languages don't give you:

  1. You know all the variables you're going to use.
  2. It's extremely easy to read and follow.
  3. Everyone who reads your code will always know what variables your subprogram uses. If you introduce a declare_expression, everyone still knows all the variables you're going to use for that particular declare_expression as well as the variables for the subprogram.

Allowing a declarative_part anywhere within a subprogram_body just causes a mess. Even with a new reserved word, it still creates a mess. With the way things are now, there's a central source of truth regarding declarations within a subprogram_body. I imagine it also makes implementation design simpler.

I know that pretty much all other languages allow this kind of thing, but the thing I like about Ada is the fact that it doesn't allow this. It makes subprograms a lot easier to read. I think that it'd be just as messy as saying "Well, now let's allow you to declare subprograms associated with tagged types wherever you want", e.g.:

type foo is tagged private;
type bar is tagged private;
procedure do_something(f: in out foo);
-- ...

Which is also something that Ada doesn't allow you to do. I get it: allowing declarations anywhere within a subprogram_body would be nice for things like heavy math functions (for example), which are known to use a lot of variables. But I think that the way Ada works now makes it stand out, and unique, compared to other languages.

yannickmoy commented 2 years ago

@ethindp Just allowing more flexibility in Ada programs does not mean that you have to use it. In particular, anything that you don't like should be rejected automatically in your coding standard checker.

Regarding the parallel with tagged types, I did not understand it, as this code is legal and accepted by GNAT:

package Tags is
   type foo is private;
   type bar is private;
   procedure do_something(f: in out foo) is null;
private
   type foo is tagged null record;
   type bar is tagged null record;
end Tags;

ethindp commented 2 years ago

@yannickmoy Ah, weird, the last time I tried that in a library project it didn't work. I possibly had a style option on though -- I don't remember what my compiler options were and I don't have the GPR project file anymore, so... I agree that adding flexibility is fine, but I am still of the opinion that this is unnecessary.

yannickmoy commented 2 years ago

@ethindp thanks for sharing your thoughts! At this stage, this is really about making experimental features available, so that we can gain experience implementing and using it. Only later could this be included in Ada if the experiment is successful.

raph-amiard commented 1 year ago

This proposal is accepted for consideration, but has been merged manually due to conflict on the remote branch.

raph-amiard commented 1 month ago

@sttaft I think I misunderstood your intent here. In the RFC as currently expressed, and it seems to follow in GNAT's implementation, one can declare pretty much any kind of declaration in statement lists:

with Ada.Text_IO; use Ada.Text_IO;

procedure Test is
   pragma Extensions_Allowed (All_Extensions);
begin
    V : Integer := 0;

    type A is new Integer;
    Inst : A := 12;
    procedure Foo (Val : A) is null;

    Put_Line (V'Image);
end Test;

This seems to be correct according to the grammar and the RFC, since any basic_declarative_item is allowed, and basic_declarative_item allows every basic_declaration:

        3.11:
        basic_declarative_item ::= 
            basic_declaration | aspect_clause | use_clause

But it seems more ambitious than the intent of the RFC, or of the conversation, which only mentions object declarations and maybe constants and renamings.

What do you think? Do we really want to allow everything, and handle the potential incidental complexity that arises?

ethindp commented 1 month ago

I know I wasn't mentioned, but I thought I'd give my input (I know I gave it previously, but...). I don't think accepting any basic_declarative_item is a good idea, for numerous reasons that are hopefully obvious. Object decls are one thing; all of that is local to that scope. But things like type declarations or subprogram decls seem a bit much. If we're going to allow those, we might as well add lambdas while we're at it since, at least in my personal experience, a local declaration that's a subprogram in other languages is incredibly rare other than in a parameter to a subprogram (not to mention that it gives me code smell vibes). But this is just me. I was going to bring up scope problems; the RFC states those rules but, unless I missed it, doesn't give the precise wording.

sttaft commented 1 month ago

This RFC evolved heavily from its original intent, and I was acting more as a scribe than as an instigator toward the end. Here is the paragraph describing the current choice of "basic_declarative_item":

Currently we are proposing to allow only "basic" declarative items in these contexts, so nested bodies are not allowed. Subprogram declarations are allowed, but only if they are defined by an expression function, an import, or an instantiation. One could argue that arbitrary declarations and bodies should be permitted. Alternatively, we could restrict it to only object declarations and use clauses, and no type declarations and no subprogram declarations. We have selected "basic_declarative_item" as this is already a well defined subset of all kinds of declarations that is used for package specs, and so is familiar to the programmer and doesn't require a newly selected subset.

I would be happy with a more limited capability. The "declare expression" has a fairly restricted syntax, and might be a good model here:

declare_expression ::= 
     declare {declare_item}
     begin body_expression

declare_item ::= object_declaration | object_renaming_declaration

Declare_item seems like it might be the right syntactic category, though I might suggest we allow "use" clauses as well.
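As an illustration of the restricted declare_item category quoted above, here is a minimal sketch of an Ada 2022 declare expression that contains only object declarations. The function name is hypothetical, and the sketch assumes a compiler with Ada 2022 support (the Ada_2022 pragma shown is GNAT's way of selecting it).

```ada
pragma Ada_2022;

with Ada.Text_IO; use Ada.Text_IO;

procedure Declare_Expr_Demo is
   --  Only object declarations and object renamings are allowed
   --  between "declare" and "begin" in a declare expression.
   function Sum_Of_Powers (X : Integer) return Integer is
     (declare
         Squared : constant Integer := X ** 2;
         Cubed   : constant Integer := X ** 3;
      begin
         Squared + Cubed);
begin
   Put_Line (Integer'Image (Sum_Of_Powers (2)));
end Declare_Expr_Demo;
```

Restricting local declarations in statement lists to the same category would rule out the type and subprogram declarations shown in the earlier Test example.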

raph-amiard commented 1 month ago

OK, thanks @sttaft! That aligns with my recollection as well.