pmderodat / ada-toml

TOML parser for Ada
Other
31 stars 5 forks source link

Objects that are modified should be passed as in out parameters #10

Closed simonjwright closed 2 years ago

simonjwright commented 2 years ago

Looking at this in toml.ads we have

   procedure Set (Value : TOML_Value; Key : String; Entry_Value : TOML_Value)
      with Pre => Value.Kind = TOML_Table;
   --  Create an entry in Value to bind Key to Entry_Value. If Value already
   --  has an entry for Key, replace it.

In other words, the intent is to modify Value.

Now, TOML_Value is a tagged type and therefore passed by reference, so since Set only touches the internals it appears to work.

However, here in alire-user_pins.adb we have

   function To_TOML (This : Pin) return TOML.TOML_Value is
      use TOML;
      Table : constant TOML_Value := Create_Table;
   begin

      [...]

      Table.Set (Keys.Path,
                 Create_String (VFS.Attempt_Portable (Path (This))));

      Table.Set (Keys.Internal, Create_Boolean (True));

      return Table;
   end To_TOML;

and the compiler is entitled to ignore the two Table.Set calls -- which claim not to alter Table - and return the unmodified Table (it being constant).

I have a strong suspicion that this is exactly what is happening with alire built on Apple M1 (aarch64).

There are several other similar subprograms.

simonjwright commented 2 years ago

The unusual (shallow) design of a TOML_Value was what had me confused. Perhaps a note on the declaration like

 --  Note: all operations on this type use shallow copies!

would help a bit.

But, why? and how to know when altering variable A also alters variable B? Isn't there an AQ&S recommendation against it?!!

pmderodat commented 2 years ago

Hello @simonjwright

Sorry for the very late answer! I don’t think this design is that unusual: the TOML_Value data type has by-reference semantics, very much like what you would have in Ada with raw access types (just safer…), in Python with standard containers by default (dict, list, … are all handled by reference), on in similar C++ APIs. That being said, I agree that the documentation is very poor in this area, I’ll try to add helpful notes, thanks for the suggestion!

As to why: I find dealing with references in an imperative language much easier for recursive data structures than when primitives operate with by-value semantics. For instance, imagine an array A : TOML_Value that contains tables. I can hardly imagine an easy by-value API that inserts a key/value association in the first table in that array without creating copies, whereas with by-reference API, it is trivial:

A.Item (1).Set (Key, Value);

and how to know when altering variable A also alters variable B? Isn't there an AQ&S recommendation against it?!!

Unless there are easy ways to prevent this from happening (and I think that’s not the case when using by-value APIs with the kind of serious consequence I’ve described above), I think it’s hard to provide such guarantees without Rust-like borrow checkers…

It seems that you already found the bug that made you open this ticket, but I just wanted to add: IIUC the only case when an Ada compiler is allowed to remove such procedure calls (i.e. when the compiler does not have visibility over the procedure implementation while it is compiling the call) is when the procedure is pure, which is not the case here.