Add hint to use multiplication and not addition for time computation

andreas-junghanns commented 2 years ago

We removed the following sentence from the standard, where it appeared as non-normative text in entirely the wrong place (4.4, description of fixedInternalStepSize). But it should be mentioned somewhere: "The co-simulation algorithm should calculate the communication points by multiplying (number_of_steps * step_size) instead of repeatedly incrementing (time += step_size), to avoid the accumulation of numerical errors."
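A minimal sketch of the two strategies named in that sentence, in Python. The function names and the `t_start` parameter are hypothetical and do not come from the standard; the code only illustrates how the communication points are computed, not how an importer calls an FMU.

```python
def points_by_addition(t_start, step_size, number_of_steps):
    """Communication points obtained by repeatedly incrementing the time."""
    t = t_start
    points = [t]
    for _ in range(number_of_steps):
        t += step_size              # rounding error can accumulate here
        points.append(t)
    return points


def points_by_multiplication(t_start, step_size, number_of_steps):
    """Communication points obtained as t_start + k * step_size."""
    return [t_start + k * step_size for k in range(number_of_steps + 1)]
```

With `step_size = 0.1` and ten steps from zero, the incremented time ends at 0.9999999999999999, while the multiplied time ends at 1.0.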

pmai commented 2 years ago

Actually, I don't think it should be mentioned anywhere, at least not in its present form. Without a longish explanation of the numerical analysis needed to reach any sane conclusion on how to manage time (which very much depends on the specific step size and simulation duration, among other things), multiplication is potentially just as prone to numerical errors as addition is; and people will then more easily get the communication step sizes wrong, since these must actually match the previous communication time.

So doing this topic justice would require writing half a monograph on simulation design from a numerical point of view.

If anything, using the multiplication trick naively usually breaks the equality tNextCommunicationPoint = tCurrentCommunicationPoint + tCurrentCommunicationStep or, if done right, makes tCurrentCommunicationStep slightly non-constant, which can break certain FMUs.
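The broken equality can be reproduced with a small sketch, assuming IEEE 754 double precision (Python's `float`). The helper name `grid_mismatches` is hypothetical; it lists the step indices k at which the previous multiplied point plus the nominal step differs from the next multiplied point.

```python
def grid_mismatches(step_size, number_of_steps):
    """Indices k where k*step_size + step_size != (k+1)*step_size in binary64."""
    return [
        k
        for k in range(number_of_steps)
        if k * step_size + step_size != (k + 1) * step_size
    ]
```

For a step size of 0.1, index 5 is reported: 0.5 + 0.1 evaluates to 0.6, whereas 6 * 0.1 evaluates to 0.6000000000000001. For an exactly representable step size such as 0.125, no mismatch occurs over the first thousand steps.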

I'd add that the reason people keep using the trick is not really actual numerical error accumulation (in the sense that the LSB round-off error of IEEE754 accumulates subtly), but rather that they find the perceived numerical error due to the use of an inexact approximation of a non-representable step size unpleasing to the eye.

HansOlsson commented 2 years ago

> Actually, I don't think it should be mentioned anywhere, at least not in its present form. Without a longish explanation of the numerical analysis needed to reach any sane conclusion on how to manage time (which very much depends on the specific step size and simulation duration, among other things), multiplication is potentially just as prone to numerical errors as addition is; and people will then more easily get the communication step sizes wrong, since these must actually match the previous communication time.

Multiplication in itself has error issues similar to those of addition, but the point is that we are not comparing one multiplication with one addition; we are comparing the error of one multiplication with the accumulated error of n additions, which can clearly be significantly larger. It is also more difficult to analyze the accumulated error, whereas it is straightforward to give strict error bounds for one multiplication.
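For reference, the standard worst-case rounding bounds behind this comparison can be written as follows. This is a sketch under stated assumptions: round-to-nearest binary64 with unit roundoff $u = 2^{-53}$, a start time of zero, no overflow or underflow, and $n < 2^{53}$ so that $n$ is exactly representable. Here $h$ denotes the stored step size and $\hat{t}_n$ the time obtained after $n$ additions.

$$
|\mathrm{fl}(n h) - n h| \le u \, n \, |h|,
\qquad
|\hat{t}_n - n h| \le \gamma_{n-1} \, n \, |h|,
\qquad
\gamma_k = \frac{k u}{1 - k u}.
$$

These are worst-case bounds and the observed error is usually smaller, but the bound for repeated addition is larger than the bound for a single multiplication by a factor of roughly $n - 1$.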

> If anything, using the multiplication trick naively usually breaks the equality tNextCommunicationPoint = tCurrentCommunicationPoint + tCurrentCommunicationStep or, if done right, makes tCurrentCommunicationStep slightly non-constant, which can break certain FMUs.

That is exactly why we need to mention that multiplication should be used, so that FMUs don't make those assumptions.

Note that it is possible to reduce the accumulated error in other ways than multiplication, but they will also break those assumptions.
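The comment does not name a specific alternative. One well-known candidate, shown here purely as an assumption, is compensated (Kahan) summation, which carries the rounding error of each addition forward in a correction term. The function name `kahan_time` is hypothetical.

```python
def kahan_time(step_size, number_of_steps):
    """Advance time from zero with Kahan (compensated) summation."""
    t = 0.0
    compensation = 0.0                       # running estimate of the lost low-order bits
    for _ in range(number_of_steps):
        y = step_size - compensation         # apply the correction to the next increment
        s = t + y                            # low-order bits of y may be lost here
        compensation = (s - t) - y           # recover what was lost
        t = s
    return t
```

Because the increment actually applied in each iteration is `step_size - compensation`, the difference between consecutive communication points is in general not exactly `step_size`, which is the broken assumption mentioned above.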

> I'd add that the reason people keep using the trick is not really actual numerical error accumulation (in the sense that the LSB round-off error of IEEE754 accumulates subtly), but rather that they find the perceived numerical error due to the use of an inexact approximation of a non-representable step size unpleasing to the eye.

If the step size is 0.001 and we take a million steps, the result is 1000-1.6e-8, and for ten million steps it is 10000+1.5e-6, so we have accumulated an error that is 0.15% of a step. To me, that is more than just unpleasing to the eye.

Added: For 100 million steps the error is 2.6% of a step, and for a billion steps the error is 1800% of a step (18 steps). The corresponding error in multiplication in the latter case is about 0.00002%.
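An experiment of this kind can be sketched as follows, assuming IEEE 754 double precision and a start time of zero. The function name `accumulated_errors` is hypothetical, and the reference value is the exact product of n and the stored double step size (computed with `fractions.Fraction`), so the figures it prints need not match the ones quoted above digit for digit.

```python
from fractions import Fraction


def accumulated_errors(step_size, number_of_steps):
    """Return (addition error, multiplication error), each as a fraction of one step."""
    t = 0.0
    for _ in range(number_of_steps):
        t += step_size                                   # repeated incrementing
    exact = Fraction(step_size) * number_of_steps        # exact rational reference
    err_add = (Fraction(t) - exact) / Fraction(step_size)
    err_mul = (Fraction(number_of_steps * step_size) - exact) / Fraction(step_size)
    return float(err_add), float(err_mul)


if __name__ == "__main__":
    for n in (10**5, 10**6):
        err_add, err_mul = accumulated_errors(0.001, n)
        print(f"n={n}: addition {err_add:.3e} steps, multiplication {err_mul:.3e} steps")
```

Larger step counts work the same way but take correspondingly longer in pure Python.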

That assumes IEEE754 double precision; with single or half precision, time will simply stop advancing. This matters if we consider targeting such environments with, e.g., eFMI and want the guide to be relevant for those cases.
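The stalling effect can be illustrated with a short sketch that emulates narrower formats by rounding through Python's `struct` module (`'f'` for binary32, `'e'` for binary16). The helper names are hypothetical. The sketch searches only powers of two, so it reports the first power of two at which adding the step no longer changes the time.

```python
import struct


def round_to(fmt, x):
    """Round a binary64 value to the format given by a struct code ('f' or 'e')."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def first_stalled_power_of_two(fmt, step_size):
    """Smallest power of two t >= 1 for which t + step_size rounds back to t."""
    h = round_to(fmt, step_size)             # step size as stored in the narrow format
    t = 1.0
    while round_to(fmt, t + h) != t:         # the sum is rounded to the narrow format
        t *= 2.0
    return t
```

With a step size of 0.001, this sketch reports 32768.0 for single precision and 4.0 for half precision: from those times onward, `time += step_size` leaves the time unchanged.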

chrbertsch commented 1 month ago

FMI Design Webmeeting:

Karl: The FMU should simulate until the sum of the provided time + timestep.

Klaus: I think the standard is clear, but there seems to be no agreement with implementers ...

We should put this on the agenda of a next meeting in advance. --> Homework for everyone.

modelica / fmi-guides

Add hint to use multiplication and not addition for time computation #24