modelica-tools / csv-compare

Tool to compare curves from one csv files with curves from other csv files using an adjustable tolerance
https://www.modelica.org
BSD 3-Clause "New" or "Revised" License
25 stars 15 forks source link

Avoid values of baseY less than 0.001 #67

Closed casella closed 2 weeks ago

casella commented 1 month ago

This change considers a minimum range of Y reference values of 0.001, to avoid false negatives when handling near-zero values that are the result of balance equations affected by small numerical errors. 0.001 is a reasonable compromise, since the nominal attribute (which would be better) is not available in CSV result files.

maltelenz commented 2 weeks ago

This seems to indicate that the SetFormerBaseAndRatio choice actually means something very specific. Maybe it shouldn't be modified, and a new type of ratio should be added instead? Alternatively, modify the name and description of what this one means. https://github.com/modelica-tools/csv-compare/blob/18b309dcd0b510abd2a911ed6fcd82245a80aa32/Modelica_ResultCompare/CurveCompare/TubeSize.cs#L88

casella commented 2 weeks ago

@maltelenz I'm not sure I understand the point of your last comment. According to the analysis by @beutlich, this is the function that is actually called to compute the parameters of the tolerance tubes. If we don't change that function, the tubes will remain too tight for near-zero variables.

As I said during the meeting, our current goal is not to restructure or refactor the CSV-compare source code, that would just take forever. Our immediate goal is just to avoid too narrow tubes when the variables are very close to zero when doing the current regression testing. Nothing more, nothing less. To my understanding, PR #66 just does that. Do you agree?

maltelenz commented 2 weeks ago

As I said during the meeting, our current goal is not to restructure or refactor the CSV-compare source code, that would just take forever.

One can try to make clean updates in the code that keeps the code understandable, or pay the ever increasing price of getting an increasingly unreadable code mess, making future understanding and fixes harder.

Of course, sometimes it may be worth it to make a quick hack, and ignore future work and cost.

Our immediate goal is just to avoid too narrow tubes when the variables are very close to zero. To my understanding, PR https://github.com/modelica-tools/csv-compare/issues/66 just does that. Do you agree?

I don't know, I haven't spent the time to understand and test the change.

casella commented 2 weeks ago

One can try to make clean updates in the code that keeps the code understandable, or pay the ever increasing price of getting an increasingly unreadable code mess, making future understanding and fixes harder.

Of course, sometimes it may be worth it to make a quick hack, and ignore future work and cost.

This update simply swaps an epsilon of 1e-12 with an epsilon of 1e-3, which is more appropriate. I can't see how this could cause any future extra work or cost. Conversely, if we don't do this, we're stuck forever with regressions of the MSL 4.1.0. We need to move on with that.

maltelenz commented 2 weeks ago

I can't see how this could cause any future extra work or cost.

The line of code I quoted in my first comment, says what the FormerBaseAndRatio is intended to do. This PR changes that behavior, but doesn't change the description. That means that the behavior of the code will deviate from the described behavior of the code. To me, this sounds like the next person that comes and reads the code will get the wrong impression of what is going on.

Conversely, if we don't do this, we're stuck forever with regressions of the MSL 4.1.0. We need to move on with that.

I'm not against changing things, I just thought I could share what I found about the FormerBaseAndRatio while quickly browsing the code. There seems to be some confusion about what it means already in this PR and the linked issue.

casella commented 2 weeks ago

I did some more analysis. The functions which calls either SetFormerBaseAndRatio() or SetStandardBaseAndRatio() are the constructors of TubeSize:

https://github.com/modelica-tools/csv-compare/blob/18b309dcd0b510abd2a911ed6fcd82245a80aa32/Modelica_ResultCompare/CurveCompare/TubeSize.cs#L77-L82

https://github.com/modelica-tools/csv-compare/blob/18b309dcd0b510abd2a911ed6fcd82245a80aa32/Modelica_ResultCompare/CurveCompare/TubeSize.cs#L90-L98

I searched the entire code base for calls to such constructors, and I only found two of them, one here:

https://github.com/modelica-tools/csv-compare/blob/18b309dcd0b510abd2a911ed6fcd82245a80aa32/Modelica_ResultCompare/CurveCompare/CurveCompare.cs#L65

and one here

https://github.com/modelica-tools/csv-compare/blob/18b309dcd0b510abd2a911ed6fcd82245a80aa32/Modelica_ResultCompare/CsvFile.cs#L316

The first constructor calls SetStandardBaseAndRatio(), the second constructor is only called with formerBaseAndRatio=true, so it only calls SetFormerBaseAndRatio(). To play safe, we can tweak both functions, just in case. I've updated the PR accordingly

casella commented 2 weeks ago

@sjoelund, @beutlich during today's MAP-LIB meeting there was agreement to go on with this change, so that @GallLeo can run the regression testing without false negative.

Unfortunately, nobody in the (virtual) room had write access to the repo. Can you please merge this PR onto master?

Thanks!

beutlich commented 2 weeks ago

The first constructor calls SetStandardBaseAndRatio(), the second constructor is only called with formerBaseAndRatio=true, so it only calls SetFormerBaseAndRatio().

CurveCompare/CurveCompare.cs itself seems nowhere referenced. See #69.

casella commented 2 weeks ago

@casella Thanks for the PR. Let me know if you are happy with my refactoring applied on top of your initial commit.

LGTM. Please merge it onto master