modelica / ModelicaStandardLibrary

Free (standard conforming) library to model mechanical (1D/3D), electrical (analog, digital, machines), magnetic, thermal, fluid, control systems and hierarchical state machines. Also numerical functions and functions for strings, files and streams are included.
https://doc.modelica.org
BSD 3-Clause "New" or "Revised" License

Improvements to the CSV compare tool for zero reference signals. #4421

Open casella opened 2 weeks ago

casella commented 2 weeks ago

While reviewing regressions from MSL 4.0.0 to MSL 4.1.0 (#4341) with @tobolar, we stumbled upon several reference signals which were very close to zero, but not exactly zero due to numerical approximation. In some cases (e.g. ModelicaTest.MultiBody.Joints.Universal.diff.universal.w_a) the variable, which ideally has an equilibrium trajectory at a constant zero value, is actually unstable, so any small numerical error grows exponentially.

For non-zero variables, numerical errors are normally not a problem, because their relative magnitude is small, so they easily fall within the tolerance tube. Unfortunately, this is not the case for variables that are meant to be zero. In fact, the more precise the solution is, the more absurdly small the high and low limits become. We had regressions from MSL 4.0.0 to MSL 4.1.0 because of variables that were off by 1.3e-16 😨

IMHO there are two proper solutions to this problem.

The first could be to somehow include the nominal attribute of each variable in the CSV data. This would allow checking the error with a criterion such as

(v - v_ref)/max(abs(v), v.nominal) < tol

which will also be robust for variables that are ideally zero, since the relative scaling will then be performed with their nominal value.
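A minimal sketch of such a check (illustrative only, not the actual csv-compare implementation; the function name and the tolerance value are assumptions, and the nominal would have to be carried along in the CSV data somehow):

```python
def within_tolerance(v, v_ref, nominal, tol=2e-3):
    """Check one sample against the reference, scaling the error by
    max(|v|, nominal) as in the criterion above, so that ideally-zero
    signals are judged against their nominal value instead of an
    absurdly small band."""
    return abs(v - v_ref) / max(abs(v), nominal) < tol
```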

The second is to take into account the fact that in most (but not all) cases these variables are components of a vector, e.g. the three Cartesian coordinates of a position, velocity, or acceleration signal, the components of a quaternion, or the real and imaginary parts of a Complex number. In this case, the error should of course be evaluated by looking at the entire vector, e.g.

(v[j] - v_ref[j])/max(sqrt(sum(v[k]^2 for k in 1:size(v, 1))), v.nominal) < tol

for 3D vectors or

(v.re - v_ref.re)/max(sqrt(v.re^2 + v.im^2), v.nominal) < tol

for Complex numbers.
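A sketch of what the vector-aware variant could look like (again illustrative only; it assumes all components of the vector at the same sample are available to the tool):

```python
import math

def component_within_tolerance(v_j, v_ref_j, v, nominal, tol=2e-3):
    """Check one component of a vector-valued signal (e.g. a 3D velocity
    or a Complex number). The error is scaled by the Euclidean norm of the
    whole vector, falling back to the nominal value when the norm itself
    is close to zero."""
    norm = math.sqrt(sum(c * c for c in v))
    return abs(v_j - v_ref_j) / max(norm, nominal) < tol
```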

Unfortunately, there is no clear Modelica semantics to declare that an array is a 3D vector or a quaternion, or that a Complex operator record is actually a vector in the Gauss plane. Maybe we need some annotations in the language to express that unambiguously? This could be added to the base classes in MSL, without the need to add this information all the time in user models.

Until these issues are resolved, I am convinced that we should avoid including reference signals that are expected to be exactly zero in the analytic solution, because they are basically just misleading. Of course you can reproduce them with the same tool, but even different versions of the same tool may produce completely different results, not to mention comparing different tools that use different numerical methods. We should avoid wasting time over "fake regressions" on such variables.

What do you think?

AHaumer commented 2 weeks ago

I think @casella is right, but it could be tedious to define nominal values for all reference signals manually. As far as I know, all tools should determine some sort of nominal values to calculate integration errors and to determine the proper step size of variable-step integration algorithms. It's just a question of how to access these nominal values to utilize them ... @HansOlsson any idea?

casella commented 2 weeks ago

@AHaumer I meant the nominal attributes. They are often defined in the basic SI types (e.g. Pressure has nominal = 1e5).

HansOlsson commented 1 week ago

A simple and pragmatic solution would be to assume that all of the variables have a nominal of at least 1e-3, and update the test-program based on that. We might miss some cases that should have a tighter tolerance - but we can deal with that later - without running the risk of accidentally removing those signals in this step.
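A sketch of what this pragmatic fix could amount to in the tool (names and the relative tolerance are illustrative assumptions, not the actual csv-compare code):

```python
# Pragmatic assumption proposed above: treat every variable as if its
# nominal were at least 1e-3.
NOMINAL_FLOOR = 1e-3

def tube_half_width(v_ref, rel_tol=2e-3):
    # The half width of the comparison tube never shrinks below
    # rel_tol * NOMINAL_FLOOR, so reference signals that are ideally zero
    # no longer produce absurdly tight limits.
    return rel_tol * max(abs(v_ref), NOMINAL_FLOOR)
```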

HansOlsson commented 1 week ago

> I think @casella is right, but it could be tedious to define nominal values for all reference signals manually. As far as I know, all tools should determine some sort of nominal values to calculate integration errors and to determine the proper step size of variable-step integration algorithms. It's just a question of how to access these nominal values to utilize them ... @HansOlsson any idea?

After additional thought I think this will be quite complicated, whereas just updating the tool to use a fixed nominal value (of 1e-3 or just 1) will be straightforward and solve the problem. Note that, due to how nominals should work, it will only be relevant for variables whose nominal is lower than this fixed value. Pressures with a nominal of 1e5 will work as well (or as badly) as before.

The reasons are due to the internals:

henrikt-ma commented 1 week ago

> The second is to take into account the fact that in most (but not all) cases these variables are components of a vector, e.g. the three Cartesian coordinates of a position, velocity, or acceleration signal, the components of a quaternion, or the real and imaginary parts of a Complex number. In this case, the error should of course be evaluated by looking at the entire vector, e.g.

For what it's worth, this is a feature we use for our own correctness comparisons. However, we don't have any automatic mechanism for identifying when the feature should be used, so we apply it manually as needed.

A closely related feature is that when the unit quaternion is actually marked as a parameterization of SO(3), we can account for the quotient space effect of q and -q representing the same orientation. In practice, we rarely need this feature, but it addresses a fundamental problem when initializing orientations from complicated nonlinear equation systems where it is hard (or not desirable) to control exactly which of the two equivalent quaternions to expect.
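A sketch of how the sign ambiguity could be handled when comparing quaternion trajectories (illustrative only, under the assumption that the four components are compared together; not the actual tooling):

```python
import math

def quaternion_deviation(q, q_ref):
    """Deviation between two unit quaternions treated as orientations:
    q and -q represent the same rotation, so measure the distance to the
    reference for both signs and keep the smaller one."""
    d_plus = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, q_ref)))
    d_minus = math.sqrt(sum((a + b) ** 2 for a, b in zip(q, q_ref)))
    return min(d_plus, d_minus)
```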

casella commented 4 days ago

> A simple and pragmatic solution would be to assume that all of the variables have a nominal of at least 1e-3, and update the test-program based on that. We might miss some cases that should have a tighter tolerance - but we can deal with that later - without running the risk of accidentally removing those signals in this step.

It's not the perfect solution but I see it as a reasonable one.

@HansOlsson the question is: who can take care of this and how long would that take? Can we make it in time for MSL 4.1.0 without delaying it unnecessarily?