precice / systemtests

Testing preCICE / solver combinations using Docker
GNU General Public License v3.0

Add a summary of results difference in the main log #276

Open · davidscn opened 3 years ago

davidscn commented 3 years ago

A bit of a consequence of https://github.com/precice/openfoam-adapter/issues/149

Example for the deal.II-OpenFOAM tutorial testing: we currently compare around 500 preCICE export VTK files for this test case. I don't understand the purpose of this huge amount of reference data. In general, one file (usually the last time step) would be enough to verify the correctness of the test case. Doing so would also allow something like this:

print_result() {
        # $? is the exit status of the comparison command run just before calling print_result
        if [ $? -eq 0 ]
        then
            echo -ne "Test passed \n"
        else
            echo -ne "Failed: \n"
            cat diff.log
        fi
}

so that we could observe any failing differences directly in the Travis log files and would not need to run these things locally.
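For illustration, the helper could be hooked into the comparison step like this (a sketch; `diff -r`, `reference/`, and `output/` are placeholder choices, not the actual comparison command of the system tests):

    # Compare the exported preCICE VTK files against the references;
    # print_result inspects the exit status of this diff via $?.
    diff -r reference/ output/ > diff.log 2>&1
    print_result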

I would also like to discuss the location of the reference results: in my opinion, the references should be hosted in the tutorials repository, so that any changes in the tutorials themselves (e.g. the restructuring) require and enable an update of the references within the same PR. This would make things much more transparent for me. Similar to a regular testing infrastructure, we would have a separate directory called tests in the tutorials repository containing all the references.
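One possible layout, just to make the proposal concrete (all names below are illustrative, nothing here is decided):

    tutorials/
        some-tutorial-case/
            fluid-participant/        # existing case setup
            solid-participant/
            tests/
                reference-results/    # exported preCICE VTK files used as references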

MakisH commented 3 years ago

I like the general idea, but I would be a bit hesitant to only compare the last files: I assume that in transient problems, this could hide issues at the beginning of the simulation which then simply "fade out". What would we gain by comparing files for one time step instead of files for all time steps? We could always apply the same (very nice) suggestion, but only to the first set of differing files.

A bit of history regarding this: in the beginning, we were comparing everything, including the internal domain of each participant. In October 2020 (see https://github.com/precice/systemtests/issues/252 and https://github.com/precice/systemtests/pull/256), we decided to only compare the exported VTK files of preCICE, mainly to make the maintenance of comparison scripts easier. A seminar paper (internal) this semester showed that this is probably indeed a safe simplification to make.

> I would also like to discuss the location of the reference results: in my opinion, the references should be hosted in the tutorials repository, so that any changes in the tutorials themselves (e.g. the restructuring) require and enable an update of the references within the same PR. This would make things much more transparent for me. Similar to a regular testing infrastructure, we would have a separate directory called tests in the tutorials repository containing all the references.

This is already planned and described in the tutorials restructuring project:

> In the future, we would also like to store reference results with each tutorial case, which could be used for the system tests. The vision is that in the future we only have tools in the systemtests repository, not reference results or case-specific patches.

But we currently have too many boxes open that depend on each other, so let's finish the restructuring first. I opened a dedicated issue so that we don't forget: https://github.com/precice/systemtests/issues/277

So: with the above in mind, how can we shape this issue into a single actionable item? :smiley:

davidscn commented 3 years ago

> I like the general idea, but I would be a bit hesitant to only compare the last files: I assume that in transient problems, this could hide issues at the beginning of the simulation which then simply "fade out". What would we gain by comparing files for one time step instead of files for all time steps? We could always apply the same (very nice) suggestion, but only to the first set of differing files.

I think it is rather unlikely that an error fades out up to machine accuracy (the tests don't even run until the end). The tests are supposed to verify the development of the adapters and preCICE, not the settings in the tutorials (?). From my perspective, the advantage would be a better overview of what we currently test, where to look when something fails, and how to update reference data properly. We can of course also print only the diff of the first file.

> But we currently have too many boxes open that depend on each other, so let's finish the restructuring first. I opened a dedicated issue so that we don't forget: #277

> So: with the above in mind, how can we shape this issue into a single actionable item? :smiley:

Thanks. OK, then let's finish the restructuring first. A doable task would be to incorporate the diff directly into the log files.

MakisH commented 3 years ago

> We can of course also print only the diff of the first file.

I would follow this approach, which should be easy to implement and does not introduce more assumptions than we already have. I renamed the issue; I hope I captured the needed action and that there is nothing else here we will forget about. Please correct me if needed.
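A minimal sketch of that approach, assuming the reference and output VTK files share file names under `reference/` and `output/` (both paths, like `diff.log`, are illustrative):

    # Compare all exported VTK files, but report only the first one that differs.
    for ref in reference/*.vtk; do
        out="output/$(basename "$ref")"
        if ! diff "$ref" "$out" > diff.log 2>&1; then
            echo "First differing file: $ref"
            cat diff.log
            exit 1
        fi
    done
    echo "Test passed"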