modelica / ModelicaSpecification

Specification of the Modelica Language
https://specification.modelica.org
Creative Commons Attribution Share Alike 4.0 International
Improve function (for purity) #3589

Open HansOlsson opened 1 month ago

HansOlsson commented 1 month ago

Continuing #1937

The current definitions of pure/impure are lacking in a number of ways, and that is becoming problematic. For users, the problem is that because the definitions are so lacking, tools don't bother to implement and check the semantics.

Specifically impure is too coarse-grained and we can broadly see three levels:

I used "level" to indicate that I don't see a need to care:

Being able to tell these apart has the following benefits in terms of clarity and future changes:

And also performance for the first level.

A preliminary implementation in Dymola uses:

It would be possible to turn this into syntax if we so desire (see #1937 for variants - including impure input).

However, it might also be best to start with them as annotations to more quickly get this into libraries in a tool-independent way, and for the moment only view restrictions as tool-warnings - and then once we have more experience we can modify the syntax and semantics (and state that the annotations behave in a similar way for backwards compatibility).
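As a rough sketch of what the annotation route could look like in a library: the wrapper name `tableFileExists` below is hypothetical, only the annotation name `__Dymola_impureConstant` is taken from this thread, and its exact semantics are still to be settled. The function is declared `impure` so that it is valid today regardless of how `Modelica.Utilities.Files.exist` is declared in the library version used.

```modelica
impure function tableFileExists
  "Hypothetical wrapper: reads the file system but never modifies it"
  input String fileName "File to look for";
  output Boolean found "true if the file exists";
algorithm
  found := Modelica.Utilities.Files.exist(fileName);
  // Vendor annotation from the preliminary Dymola implementation;
  // tools that do not know it simply ignore it.
  annotation(__Dymola_impureConstant=true);
end tableFileExists;
```

Because vendor-specific annotations are ignored by other tools, a library using this stays tool-independent, and any restriction based on it can initially be reported as a warning only.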

For pure functions the corresponding issue is thread-safety. I assume that should be a separate issue.

henrikt-ma commented 4 weeks ago

I agree that we need something more expressive than just pure/impure, and largely agree with the picture presented in the introductory comment above.

I think a lot of this can be solved by extending the function variability described in https://specification.modelica.org/master/operators-and-expressions.html#function-variability. That is, by allowing function variability to be declared rather than just inferred from default arguments, much of what we need would follow automatically. Similar to today, function variability could remain inferred, but now not only from default arguments, but also from the function variability of functions used in the body.

A key advantage of extending the function variability concept is that we get the semantic restrictions based on variability for free.

With an extended function variability concept, we could use pure/impure just to orthogonally express whether there are side-effects.

According to this idea, annotation(__Dymola_impureConstant=true) would instead be expressed like so:

impure constant function f
  …
end f;

I see a small syntactic challenge in that we don't have a variability keyword for the highest variability (non-discrete-time), while one would also like function variability to be implicit when no variability prefix is explicitly given. That is, to express annotation(__Dymola_impureRandom=true) one would need something different than the current selection of variability prefixes, for instance:

impure impure function f

or

impure "random" function f

or

impure false function f

(When choosing syntax, one should also consider what it will look like when combined with pure, which makes impure impure a poor candidate in my opinion, since it would be difficult to remember that pure impure is not the same thing as impure constant.)

Similar to pure/impure, it should be deprecated from the start to not specify function variability for an external function. For safety and increased backward compatibility, we could state that the default function variability of an impure external function is "random", while it might make more sense for a pure external function to be constant by default.

HansOlsson commented 4 weeks ago

> I agree that we need something more expressive than just pure/impure, and largely agree with the picture presented in the introductory comment above.
>
> I think a lot of this can be solved by extending the function variability described in https://specification.modelica.org/master/operators-and-expressions.html#function-variability. That is, by allowing function variability to be declared rather than just inferred from default arguments, much of what we need would follow automatically. Similar to today, function variability could remain inferred, but now not only from default arguments, but also from the function variability of functions used in the body.

I'm not convinced that it is really about variability. However, I have no problem with using impure constant, even though I think it is more a matter of input vs. output for those functions. (Thus it is more "const" in sort of the C/C++ member function sense than in the Modelica sense.)

> A key advantage of extending the function variability concept is that we get the semantic restrictions based on variability for free.
>
> With an extended function variability concept, we could use pure/impure just to orthogonally express whether there are side-effects.
>
> According to this idea, annotation(__Dymola_impureConstant=true) would instead be expressed like so:
>
> impure constant function f
>   …
> end f;
>
> I see a small syntactic challenge in that we don't have a variability keyword for the highest variability (non-discrete-time), while one would also like function variability to be implicit when no variability prefix is explicitly given. That is, to express annotation(__Dymola_impureRandom=true) one would need something different than the current selection of variability prefixes, for instance:

In particular, I think the variability argument breaks down here, as I don't see "impure random" as being at the "top of the hierarchy", since it doesn't actually impact other functions (as level 3 does).

Additionally, these functions don't really have side-effects (they depend on, but don't influence the external state), which would imply they are pure - but that seems even more confusing.

> Similar to pure/impure, it should be deprecated from the start to not specify function variability for an external function. For safety and increased backward compatibility, we could state that the default function variability of an impure external function is "random", while it might make more sense for a pure external function to be constant by default.

I would rather say that we keep treating external functions as impacting the external state. However, having the possibility to deduce these properties between functions would be helpful; the idea is that functions may in clever ways go down a level but not up (a function might use the random value in a way that removes the randomness, e.g., a quick-sort of unique values that internally uses a random function to select the pivot; but a non-random function cannot become random).
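A deliberately tiny stand-in for the quick-sort case, as a sketch only: the function below calls a random function internally, yet its result does not depend on the random value. The name `maxFromRandomStart` is hypothetical, and the call to `Modelica.Math.Random.Utilities.impureRandomInteger(id, imin, imax)` assumes the signature as found in recent Modelica Standard Library versions, with `id` obtained from `initializeImpureRandom`.

```modelica
impure function maxFromRandomStart
  "Hypothetical: the random start index does not influence the result"
  input Real v[:] "Non-empty vector";
  input Integer id "Id returned by initializeImpureRandom (assumed MSL API)";
  output Real m "Largest element of v";
protected
  Integer n = size(v, 1);
  Integer start;
algorithm
  assert(n > 0, "v must not be empty");
  // Random choice of the first candidate; any index gives the same final m.
  start := Modelica.Math.Random.Utilities.impureRandomInteger(id, 1, n);
  m := v[start];
  for i in 1:n loop
    m := max(m, v[i]);
  end for;
end maxFromRandomStart;
```

With today's rules the function has to be declared `impure` because it calls an impure function. The point of deducing properties between functions is that a tool, or the author, could state that the output is nevertheless not random, even though the hidden state of the random generator is still advanced by the call.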

However, this also indicates that more discussion is needed and we have to figure out how to introduce this in a good way. It may indicate that the annotations are useful as an intermediate solution, while we figure out the best way forward.

HansOlsson commented 3 weeks ago

Thinking more I realized that impureRandom isn't an ideal name.

The problem is that it differentiates it from impure constant, not from default impure - so impureNonWrite might be better, indicating that it doesn't write to the external environment (or impureOutput=false).

HansOlsson commented 2 weeks ago

@HansOlsson add two examples.

HansOlsson commented 2 weeks ago

As for examples, I will first present two examples that I want to support in a good way:

First a corrected example based on https://github.com/modelica/ModelicaStandardLibrary/issues/4472

model checkExist
  Boolean Esiste(start = false);
equation 
  if (time > 0.0) then
    Modelica.Utilities.Streams.print(String(time),"file.txt");
  end if;
  when sample(0.1,0.1) then
    Esiste = Modelica.Utilities.Files.exist("file.txt");
  end when;
end checkExist;

And a second simplified example:

  model M
    parameter String tabFile;
    Modelica.Blocks.Sources.CombiTimeTable combiTimeTable(tableOnFile=Modelica.Utilities.Files.exist(tabFile),
      table=[0.0,0.0; 1,1; 2,4; 3,9],
      tableName="table",
      fileName=tabFile)
      annotation (Placement(transformation(extent={{-256,-42},{-236,-22}})));
  end M;
  parameter String tabFile="test.txt";
  M m[10](each tabFile=tabFile);

The intents of the examples are:

Unpredictable cases are e.g.:

  parameter String s;
  parameter Real A[:,:]=...;
  parameter Real B[:,:]=...;
  parameter Boolean ok1=Modelica.Utilities.Streams.writeRealMatrix(s, "A", A);
  parameter Boolean ok2=Modelica.Utilities.Streams.writeRealMatrix(s, "B", B, append=true);

The problem here is that there's no guarantee that the "first" write-call will be ordered before the "second", so maybe B is written first and then A is added without appending to the file and thus overwriting B (the intent was the other order). I don't have a good solution for how to handle this, but adding impureConstant will at least ensure that we can see whether this issue may occur or not.
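One possible workaround with today's language, sketched under assumptions and not a general solution to the ordering problem: put both writes into a single function, since statements in an algorithm section are executed in order. The function name `writeAThenB` is hypothetical; `writeRealMatrix` is called with the same arguments as in the example above.

```modelica
impure function writeAThenB
  "Hypothetical helper: writes A first, then appends B to the same file"
  input String fileName;
  input Real A[:, :];
  input Real B[:, :];
  output Boolean ok "true if both writes succeeded";
protected
  Boolean okA;
  Boolean okB;
algorithm
  // Sequential statements: the file is created with A before B is appended.
  okA := Modelica.Utilities.Streams.writeRealMatrix(fileName, "A", A);
  okB := Modelica.Utilities.Streams.writeRealMatrix(fileName, "B", B, append=true);
  ok := okA and okB;
end writeAThenB;
```

The two parameters `ok1` and `ok2` would then collapse into a single `parameter Boolean ok = writeAThenB(s, A, B);`. This only helps when both calls can be grouped by the library author; it does nothing for independent components that happen to write to the same file.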

I will not consider evaluable parameters in detail; I think we can deal with that later. We can say, though, that having to evaluate a parameter bound to a general impure function is problematic (what about side-effects?). If the function is impureConstant, we can at least say that the simulation is valid as long as the external environment wasn't changed.