Clarifying pre for arrays

HansOlsson commented 4 years ago

The specification says that the argument to pre must be a variable. I just wanted to clarify that it can only be a variable and, in particular, that a component reference is not allowed.

Obviously pre(x[1]) would be trivial to handle as pre(x)[1] (using generalized indexing to explain the result), but then we have pre(x[i]) which in general should be pre(x)[pre(i)] and finally pre(x[i+j]) to pre(x)[pre(i)+pre(j)] and the latter would be weird when we don't allow pre(i+j).

One reason we don't allow general expressions in pre is that it messes up the count of initial conditions. Some uses of component references were removed because of that, without realizing this underlying issue - see https://github.com/modelica/ModelicaStandardLibrary/issues/1016

(I thought we had a previous issue on this, but I couldn't find it.)

sjoelund commented 4 years ago

The following is a list of libraries (including MSL) that use pre as component reference with subscripts: pre-with-subscript.txt

sjoelund commented 4 years ago

Note: I suspect that all those cases use constant expressions (or considered structural parameters)

HansOlsson commented 4 years ago

Note: I suspect that all those cases use constant expressions (or considered structural parameters)

Most do, but not all:

build/lib/omlibrary/Modelica 2.2.2/Electrical/Digital.mo:pre(auxiliary[n])
build/lib/omlibrary/PowerSystems 1.0.0/Control/Modulation.mo:pre(phiIgn[k, pre(n[k])

To clarify the latter: k is a loop index and thus constant, but n is discrete.

gkurzbach commented 4 years ago

Obviously pre(x[1]) would be trivial to handle as pre(x)[1] (using generalized indexing to explain the result), but then we have pre(x[i]) which in general should be pre(x)[pre(i)] and finally pre(x[i+j]) to pre(x)[pre(i)+pre(j)] and the latter would be weird when we don't allow pre(i+j).

Wouldn't it be possible to handle pre(x[i]) always as pre(x)[i]? Then there is no need to forbid this, and for me this is what users usually expect.

HansOlsson commented 4 years ago

Wouldn't it be possible to handle pre(x[i]) always as pre(x)[i]? Then there is no need to forbid this, and for me this is what users usually expect.

It would be a possibility, even if a bit odd to me. Obviously we have to make it clear.

HansOlsson commented 4 years ago

Starting a poll regarding pre(x[i]) for non-parametric i. Select one or more of the 4 emojis below as response before May 20th.

The simple way to make a poll among different alternatives on GitHub seems to be to go to the right corner of this post and select one (or more) of the reactions.

The possibilities are:

1. Forbid, i.e. require that the subscript is a parameter expression. :tada: (called hooray)
2. Handle as pre(x)[pre(i)]; see below. :heart:
3. Introduce y=x[i], and use pre(y). I'm not sure that the result is identical to the previous alternative. This manual rewriting has been used in some cases. :rocket:
4. View as pre(x)[i]. If people want pre(x)[pre(i)] they can write that manually (as is already the case in some libraries). :eyes:

Some comments: It seems we have a number of libraries relying on variant 4, as it seems that Dymola by mistake implemented this in some cases.

Option 2 would need a lot more work in order to handle general expressions, and not just variables as subscripts - and in those cases it will likely introduce too many initial conditions for the subscripts (e.g. for pre(x[i+j]) we would need start-value for both i and j; even though we only need one.)

Additionally, a downside of options 2 and 4 is that they may introduce unneeded initial conditions for x (since we need start-values for the entire x-vector).

The downside with option 3 is that it does introduce extra initial conditions that cannot be set (start-value for y) - which is very problematic. When using option 3 manually that downside is gone, but you have an extra variable.

Option 1 is the most restrictive.
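
To make the difference between the alternatives concrete, here is a minimal sketch of how options 2, 3 and 4 could be written out manually for a time-varying subscript, using only forms that option 1 would still permit. All model and variable names (PreSubscriptVariants, px, r2, r3, r4) are hypothetical and chosen for illustration. Since pre(x)[i] is not valid Modelica syntax as such, the sketch stores pre(x) in an array px before subscripting it.

```modelica
model PreSubscriptVariants "Hypothetical sketch; all names are illustrative"
  Integer i(start = 1, fixed = true) "time-varying subscript";
  Real x[3](each start = 0, each fixed = true);
  Real px[3] "pre(x), stored so that it can be subscripted";
  Real y(start = 0, fixed = true) "auxiliary variable y = x[i] (option 3)";
  Real r2 "option 2 written manually: pre(x)[pre(i)]";
  Real r3 "option 3 written manually: pre(y)";
  Real r4 "option 4 written manually: pre(x)[i]";
equation
  when sample(0, 1) then
    i = mod(pre(i), 3) + 1;
    for k in 1:3 loop
      // k is a loop index, so this subscript is a parameter expression
      x[k] = pre(x[k]) + k;
    end for;
    px = pre(x);
    y = x[i];
    r2 = px[pre(i)];
    r3 = pre(y);
    r4 = px[i];
  end when;
end PreSubscriptVariants;
```

Note that the manual form of option 3 needs an explicit start value for y, and that px requires start values for the entire x-vector, which matches the downsides listed above.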

kabdelhak commented 4 years ago

I actually think that the expression has to be taken literally and expressions inside should not inherit the pre operator. I expect from pre (as well as from der) that the inside of that call is a state (discrete or non discrete). So the auxiliary approach looks correct to me. Also the statement from 4. has to be applied here. If someone wants pre(x[pre(i)]), they have to write that manually. Also there would not be a way of expressing pre(x[i]), since it would be converted in the other approaches.

In that matter I do not understand why expressions are allowed inside der but not inside pre. With the auxiliary approach that would not be a problem.

The additional initial conditions should not pose a problem for compilers, since they have a very simple structure and can be stripped (and checked for consistency) fairly easily from what I can see.

HansOlsson commented 4 years ago

I actually think that the expression has to be taken literally and expressions inside should not inherit the pre operator. I expect from pre (as well as from der) that the inside of that call is a state (discrete or non discrete). So the auxiliary approach looks correct to me. Also the statement from 4. has to be applied here. If someone wants pre(x[pre(i)]), they have to write that manually. Also there would not be a way of expressing pre(x[i]), since it would be converted in the other approaches.

I understand the point, but since we write pre(x[i]) instead of pre(x)[i] it's not clear to me that this is the only choice.

In that matter I do not understand why expressions are allowed inside der but not inside pre. With the auxiliary approach that would not be a problem.

The additional initial conditions should not pose a problem for compilers, since they have a very simple structure and can be stripped (and checked for consistency) fairly easily from what I can see.

The initial conditions cannot be stripped and don't have to be consistent. The problem is that even if an event always sets both x:=z and y:=z we cannot conclude/require that pre(x)=pre(y), since nothing is specified before that first event. For derivatives we can conclude that x=y leads to der(x)=der(y), but der(x)=der(y) only gives that x=y+"constant". (The "constant" is constant during the simulation, except if reinit is used.)
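
As a small illustration of the point about pre(x) and pre(y), consider the following sketch. The model name and the start values are assumptions made for this example only.

```modelica
model PreStartValuesDiffer "Hypothetical sketch; names and start values are illustrative"
  Real z = sin(time);
  Real x(start = 1, fixed = true);
  Real y(start = 2, fixed = true);
equation
  when sample(1, 1) then
    // Every event sets both variables to the same value.
    x = z;
    y = z;
  end when;
end PreStartValuesDiffer;
```

Until the first event at time = 1, x and y keep their different start values, so pre(x) and pre(y) differ at that first event even though every event assigns both of them the same value.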

For synchronous elements it's different as variables have no value when the clocks don't tick.

casella commented 4 years ago

I'd like to ask for a clarification, which also holds for der. Is it possible to define an array but then only write equations so that only some elements are actually state variables? E.g.:

model foo
  Real x[2], v[2];
equation
  der(x[1]) = -x[1] + x[2];
  x[2] = sin(time);
  when sample(0,1) then
    v[1] = pre(v[1]) + 1;
    v[2] = time;
  end when;
end foo;

I'd say this is probably not good modelling practice, and could make the implementation of an efficient array-handling compiler a nightmare, but what does the Specification say about it?

henrikt-ma commented 4 years ago

In option 3, would it be more clear to say the following?

Handle as pre(y) where y is an auxiliary variable defined as y = x[i].

It would make it more clear that it isn't the user who is supposed to introduce y (which wouldn't make sense in view of then being the same as option 1, and with clear possibility to set start value).

HansOlsson commented 4 years ago

In option 3, would it be more clear to say the following?

Handle as pre(y) where y is an auxiliary variable defined as y = x[i].

It would make it more clear that it isn't the user who is supposed to introduce y (which wouldn't make sense in view of then being the same as option 1, and with clear possibility to set start value).

True, that is clearer and was the intent for option 3.

phannebohm commented 4 years ago

(slightly off topic)

Is it possible to define an array but then only write equations so that only some elements are actually state variables?

I'd say this [...] could make the implementation of an efficient array-handling compiler a nightmare, [...]

I'm not 100% certain, but in the long run this may not complicate things as much as you think. Since we might need to match different parts of arrays to different equations, we should consider slicing the arrays into independent sub-arrays accordingly, anyway. Having only part of the array be a state shouldn't be a problem, if we do preliminary slicing to separate states.

However, it probably would get extremely weird, when the decision which part is a state becomes time dependent...

phannebohm commented 4 years ago

Wouldn't it be possible to handle pre(x[i]) always as pre(x)[i]? Then there is no need to forbid this, and for me this is what users usually expect.

I'm not an experienced modeller, but this is not what I expected when I first thought about it. Also I would expect 2 and 3 to have the same effect rather than 4 and 3. So maybe this is less about intuition and more about practicality. Forcing people to explicitly formulate their intent may be the cleanest option.

HansOlsson commented 4 years ago

In either formulation for option 3 it is not clear to me how pre(y) is then treated. Is it the same as option 2 or the same as option 4. It can be interpreted differently. I think this option should not be used without further clarification how to interpret it.

There is no difference in handling pre(y) in the proposals for a variable y, the difference is only for pre(y[i]) when there is subscripting - especially when i is time-varying. So the idea of option 3 is to automatically introduce a new auxiliary variable y=x[i] and view pre(x[i]) as pre(y).

gkurzbach commented 4 years ago

Let's consider the question where the i in x[i] comes from. Often the components of an array are accessed in a loop like this:

when sample(...) then
  for i in 1:size(x,1) loop
    x[i]=pre(x[i])+i;
  end for;
end when;

In this case it makes no sense to have pre(i) (variant 2), because i is a loop variable and changes in every iteration. Replacing x[i] by y with y=x[i] (variant 3) makes no sense either, for the same reason, and is only feasible if the loop range is really known at compile time. Applying pre() to such a variable should lead to an error and should not be done automatically. These cases should be handled as pre(x)[i]. There is no need to forbid this (variant 1). So I would apply this rule also in the general case, and if someone really wants pre(x[pre(i)]), they can write it down. There is also no need to generate an additional variable, at least if we do not allow general expressions inside pre(). Even in that case it works: pre(x[i]+i)==pre(x)[i]+i if pre() is not automatically applied to loop variables.

henrikt-ma commented 4 years ago

Yes, I think we can agree that pre(i) at least doesn't have the meaning of accessing an earlier value when i is a loop variable. To me, however, this suggests that pre should be a no-op for this i, just as pre should be a no-op when applied to parameters and constant expressions. This would allow interpretation of pre on general expressions by just propagating it down the expression tree until it hits a variable for which it isn't a no-op.

The result of this basic rule would be that pre(x[i]) is the same as:

pre(x)[i] when i is a loop variable, parameter, constant, or anything else for which pre(i) is a no-op.
pre(x)[pre(i)] when i is any kind of variable for which pre(i) isn't a no-op.

I voted for option 1 since it is the only option that clearly wouldn't conflict with this kind of generalization of pre to general expressions in the future.

christoff-buerger commented 4 years ago

Inspired by what @casella and @gkurzbach said, I'd like to highlight the need for consistency of such a proposal.

Consider the current behavior of der:

model M1
  Real x1, y1, z1;
  Real x2, y2, z2;
  Real x3, h;
equation 
  x1 = der(y1 * z1);
  y1 = sin(time);
  z1 = cos(time);

  x2 = y2 * z2;
  y2 = der(y1);
  z2 = der(z1);

  x3 = der(h);
  h = y1 * z1;
end M1;

Obviously, der(y * z) is something different from der(y) * der(z); it is also not der(der(y) * der(z)). A der(expr) can be interpreted as der(h) where h = expr (modulo the special handling of initialization, but we can just assume h to have no special initialization semantics nor value).

If it comes to indices however, and computations on such, things are slightly different. The meaning of:

model M2
  Real x[10];
equation 
  for i in 1:9 loop
    x[i] = der(x[i+1]);
  end for;
  x[10] = sin(time);
end M2;

is relatively clear. It is not der(x)[der(i+1)], but rather der(x)[i + 1] and der(h) with h = x[i + 1] (due to vectorization both are identical). Both -- der(x)[i + 1] and der(h) with h = x[i + 1] -- would not in general be identical however, if, and only if, we would permit runtime value dependent indexing. To make this point clear, consider the following model:

model M
  Real x[10];
  Real y;

  Real h1;
  Real y1;

  Real h2[10];
  Real y2;

equation 
  y = der(x[integer(mod(time, 10) + 1)]);
  for i in 1:9 loop
    x[i] = der(x[i + 1]);
  end for;
  x[10] = sin(time);

  h1 = x[integer(mod(time, 10) + 1)];
  y1 = der(h1);

  h2 = der(x);
  y2 = h2[integer(mod(time, 10) + 1)];
end M;

Such runtime dependent indexing should be strictly forbidden in my opinion. If that is the case, then as I already said before, der(x)[i + 1] and der(h) with h = x[i + 1] are identical for a vector x because of the implicit vectorization semantic of Modelica. It does not matter if we first compute the derivatives of all elements of a vector and then index or if we first index and thereafter compute the derivative of the indexed element.

Thus, indexing should be independent of runtime values; this implies that der(x)[der(i)] makes absolutely no sense. What should be the derivative of a non-runtime dependent value? It is like the derivative of a constant, and that would always just be 0. Like der, pre as an operator works on runtime values; pre on indexes makes therefore as little sense as der on such. A phrase of the form pre(x)[pre(i)] is rubbish in my opinion. What should that mean? The previous index used (a runtime value it must not be)? But what should the previous index be? The last one that happened to be used, i - 1? Makes no sense. It only gets meaning if i has a runtime/simulation dependency, but then indexing is not safe.

I hope we can define pre and der in a consistent way. Because for me it feels strange if different operators have different application rules, like foo(x + y) means foo(x) + foo(y) but bar(x + y) means bar(h) with h = x + y.

kabdelhak commented 4 years ago

Inspired by what @casella and @gkurzbach said, I'd like to highlight the need for consistency of such a proposal.

Consider the current behavior of der:
model M1
  Real x1, y1, z1;
  Real x2, y2, z2;
  Real x3, h;
equation 
  x1 = der(y1 * z1);
  y1 = sin(time);
  z1 = cos(time);

  x2 = y2 * z2;
  y2 = der(y1);
  z2 = der(z1);

  x3 = der(h);
  h = y1 * z1;
end M1;
Obviously, der(y * z) is something different from der(y) * der(z); it is also not der(der(y) * der(z)). A der(expr) can be interpreted as der(h) where h = expr (modulo the special handling of initialization, but we can just assume h to have no special initialization semantics nor value).

As I highlighted in a discussion on the OpenModelica Trac we believe that a derivative call on a non component reference should always be replaced with an auxiliary: https://trac.openmodelica.org/OpenModelica/ticket/5934

This is far more robust for index reduction and produces smaller and more predictable state handling as far as we know. I think this should be the way to go in general and maybe that also applies to pre calls. You can find further information on the ticket.

HansOlsson commented 4 years ago

Result when closing poll:

1. Forbid, i.e. require that the subscript is a parameter expression. 🎉 (called hooray), 12 votes
2. Handle as pre(x)[pre(i)]; see below. ❤️, 1 vote
3. Automatically introduce an extra variable y with y=x[i], and use pre(y). I'm not sure that the result is identical to the previous alternative. This manual rewriting has been used in some cases. 🚀, 3 votes
4. View as pre(x)[i]. If people want pre(x)[pre(i)] they can write that manually (as is already the case in some libraries). 👀, 6 votes

The first option (forbid) is the clear winner.

I noticed two issues with the voting itself that we need to improve for the future, but the result was clear enough: I don't recognize all the names (as some have odd names in GitHub - at meeting we normally allow guests to participate but that is less clear here), and you can only see the first ten persons for each alternative. Using some doodle alternative might be preferable in the future.

modelica / ModelicaSpecification

Clarifying pre for arrays #2556