AdaCore / ada-spark-rfcs

Platform to submit RFCs for the Ada & SPARK languages
62 stars 28 forks source link

[RFC] External_Initialization aspect #100

Closed dkm closed 1 year ago

dkm commented 1 year ago

First draft for the 'Embed attribute External_Initialization aspect (inspired by C's #embed, Rust's include_bytes!() and probably others).

dkm commented 1 year ago

Rendered document

sttaft commented 1 year ago

As mentioned in another email chain about this, I would recommend avoiding a construct that looks executable/evaluable such as an attribute. The implication would be that each time it is encountered in the flow of execution, it is re-evaluated, implying reading a file at run time rather than compile time, which is clearly not the intent.

An alternative approach is to use an aspect on the declaration, with semantics approximating "Import" though without implying the object is allocated and initialized outside of Ada. Rather, we are saying it is a regular Ada object, but its initialization expression is defined in an external file. For example:

Some_Other_Data : Some_Record with External_Initialization => "/some/data.raw";

dkm commented 1 year ago

As mentioned in another email chain about this, I would recommend avoiding a construct that looks executable/evaluable such as an attribute. The implication would be that each time it is encountered in the flow of execution, it is re-evaluated, implying reading a file at run time rather than compile time, which is clearly not the intent.

An alternative approach is to use an aspect on the declaration, with semantics approximating "Import" though without implying the object is allocated and initialized outside of Ada. Rather, we are saying it is a regular Ada object, but its initialization expression is defined in an external file. For example:

Some_Other_Data : Some_Record with External_Initialization => "/some/data.raw";

This was the main subject when this RFC was discussed... IIRC, the main objection to the aspect is that you can only use it on the declaration. This would make it closer to the C' #embed than to Rust's include_bytes!().

@jklmnn proposed something like (extending the Import aspect/pragma) :

Some_File_Data : Byte_Array with
   Import,
   Convention => File,
   External_Name => "/some/file/data.raw";

pragma Import (Some_File_Data, File, "/some/file/data.raw");
raph-amiard commented 1 year ago

Yes there was indeed (very relative) agreement that the aspect version was a bit more flexible in how it could be used, for no visible drawback.

@sttaft While I see your point for me it's not generally clear anyway what "evaluation of an expression" entails in modern languages, so this is not a compoundly very big problem IMO.

OTOH I'm not sure being able to use this directly in expressions yields an expressive power that is very useful in most cases anyway.

I was not fond of re-using the Import aspect though, so if we go this way, I do indeed prefer having a specific aspect for this purpose.

sttaft commented 1 year ago

Yes there was indeed (very relative) agreement that the aspect version was a bit more flexible in how it could be used, for no visible drawback.

Did you mean the "attribute version" above?

@sttaft While I see your point for me it's not generally clear anyway what "evaluation of an expression" entails in modern languages, so this is not a compoundly very big problem IMO.

I don't understand this comment. Could you clarify? Ada certainly has a very clear notion of what it means to evaluate an expression, and it pretty much always has semantics equivalent to a run-time action, with the one exception being static expressions, which are generally done in infinite precision.

OTOH I'm not sure being able to use this directly in expressions yields an expressive power that is very useful in most cases anyway.

That was my sense. Personally I saw many complexities in trying to give the attribute version a good definition in Ada such that it was certain to be performed on the host machine at compile time, and I am sure Steve Baird could come up with an order of magnitude more complexities than me.

I was not fond of re-using the Import aspect though, so if we go this way, I do indeed prefer having a specific aspect for this purpose.

Import and Convention both seem to have the wrong semantics, so a new aspect would be preferable.

raph-amiard commented 1 year ago

Did you mean the "attribute version" above?

Yes

I don't understand this comment. Could you clarify?

What I mean is that we could, I think, define the evaluation of embed in such a way that the read is done only once, and every subsequent use is only a syntactic "inlining" of the content.

Furthermore, the version with an aspect does suffer from similar problems, if we don't restrict its use to the top level, or at least to constants:

procedure Foo (J : Integer) is
   Some_Other_Data : Some_Array with External_Initialization => "/some/data.raw";
   --  Is this some kind of global "static" (in the C sense) ? Or is a new copy of the data initialized every time ?
begin
   Some_Other_Data (J) := 15;
   Do_Something_With (Some_Other_Data);
end Foo;

procedure Bar is
begin
   for J in 1 .. 100 loop
      Foo (J);
   end loop;
end Bar;

That was my sense. Personally I saw many complexities in trying to give the attribute version a good definition in Ada such that it was certain to be performed on the host machine at compile time, and I am sure Steve Baird could come up with an order of magnitude more complexities than me.

We discussed it again, taking your feedback into account, yesterday, and discussing other potential topics, such as mutability, constness, where the data is located (stack or not), validity of the result, and the version I was most satisfied with was something akin to:

Would be interested to have your thoughts

sttaft commented 1 year ago

On Tue, Feb 14, 2023 at 9:54 AM Raphaël AMIARD @.***> wrote:

Did you mean the "attribute version" above?

Yes

I don't understand this comment. Could you clarify?

What I mean is that we could, I think, define the evaluation of embed in such a way that the read is done only once, and every subsequent use is only a syntactic "inlining" of the content.

But we never want it executed on the target machine, so it really needs to be a special notion of a host-evaluated expression, which is definitely a bit of an odd beast.

Furthermore, the version with an aspect does suffer from similar problems, if we don't restrict its use to the top level, or at least to constants:

procedure Foo (J : Integer) is Some_Other_Data : Some_Array with External_Initialization => "/some/data.raw";begin Some_Other_Data (J) := 15; Do_Something_With (Some_Other_Data);end Foo;

This is a bit less of an issue, because aspect specifications are frequently compile-time or link-time things, with strings that might never reside on the target, such as the External_Name aspect, and which are "processed" only once, on the host, during compile time. Exception declarations are similarly something that is effectively a compile-time thing, which are "processed" only once even if the subprogram is executed hundreds of times.

...

That was my sense. Personally I saw many complexities in trying to give the attribute version a good definition in Ada such that it was certain to be performed on the host machine at compile time, and I am sure Steve Baird could come up with an order of magnitude more complexities than me.

We discussed it again, taking your feedback into account, yesterday, and discussing other potential topics, such as mutability, constness, where the data is located (stack or not), validity of the result, and the version I was most satisfied with was something akin to:

  • We use an aspect like External_Initialization in your example above
  • The use of object declarations with such an aspect is restricted to library level packages (not a big problem for the kind of uses we envision, and solves a ton of potential problems)
  • The validity of the read data is checked at elaboration time, via a call to 'Valid on the read data

Would be interested to have your thoughts

Makes sense to me. I don't see the strong need to limit it to library-level objects (though limiting it to library-level types would definitely help), but it is always better to impose limitations initially and lift them later, than vice versa. I would agree that it is more efficient to only do the validity checks once, and that is more natural for library-level objects.

Take care, -Tuck

Message ID: @.***>

raph-amiard commented 1 year ago

@dkm Could you rewrite the RFC to take the above into account ? TIA :)

sttaft commented 1 year ago

In your examples you have: Blah'Valid;

I think you really want: pragma Assert (Blah'Valid);

sttaft commented 1 year ago

You have used the 'Valid attribute in cases where I believe you should be using the GNAT attribute 'Valid_Scalars. Ada's 'Valid attribute is not defined for arrays or records (see http://www.ada-auth.org/standards/22aarm/html/AA-13-9-2.html). Also, you have at least one misspelling of "literal" (you had two "t"s, I believe).

t-14 commented 11 months ago

As a reference, here is a lightweight mixed Ada/Asm example that allows us to achieve similar functionality today.

$ cat my.bin
Hello world!
$ cat my_bin.s
.section .rodata
  .global _binary_mybin_start
  .align 4
_binary_mybin_start:
  .incbin "my.bin"
_binary_mybin_end:
  .global _binary_mybin_size
  .align 4
_binary_mybin_size:
  .int  _binary_mybin_end - _binary_mybin_start

$ cat hello.adb
with Text_IO;

procedure Hello is
   Start    : constant Character
     with Import     => True,
          Convention => C,
          Link_Name  => "_binary_mybin_start";
   Length   : constant Integer
     with Import     => True,
          Convention => C,
          Link_Name => "_binary_mybin_size";
   Mybin : String (1 .. Length) with Address => Start'Address;
begin
   Text_IO.Put_Line (Mybin);
end Hello;

$ cat hello.gpr
project Hello is
   for Languages use ("Ada", "Asm");
   for Main use ("hello.adb");
end Hello;

$ gprbuild -q hello
$ hello
Hello world!

Presumably, we want to have an intrinsic facility that hides most/all the above book-keeping while providing similar functionality.

In that sense, two things that I'd like to point out:

  1. I think restricting it to library-level objects would be unfortunate. As you see, the example above doesn't have this limitation.
  2. Given that the proposal involves explicit mention of a file system object in the Ada source code, I am troubled by the associated issues. Requiring an absolute path is certainly a no-no as this would make the source non-relocatable. Supporting a relative path raises the question of "relative to what?" The asm's .incbin directive used above relies on -I path search mechanism which is not something that Ada currently has. Do we want to make it an "implementation permission" to define how this file object is searched?