MFlowCode / MFC

Exascale simulation of multiphase/physics fluid dynamics
https://mflowcode.github.io
MIT License
137 stars 60 forks source link

[update] refactor advective source terms [old] Major Refactor of s_compute_rhs #330

Closed wilfonba closed 4 months ago

wilfonba commented 6 months ago

For @henryleberre

wilfonba commented 6 months ago

@sbryngelson @henryleberre This randomly failed two tests but passed most of them. There's potential that this one failed test actually indicates an error in the code, but it's odd that so many of them passed without issue. Any thoughts?

sbryngelson commented 6 months ago

The failures were on the Intel compilers. You could have modified the actual code meaning, or just the compiled code. It seems like the latter is more likely. I re-ran the jobs but I would double check your changes and why its failing certain specific tests.

wilfonba commented 6 months ago

As far as I can tell, all of the logic and such in the code is the same as it was before. I'm guessing this is some sort of problem with Intel compilers and the inlining of subroutines or something like that.

sbryngelson commented 6 months ago

Since it passes the intel compiler debug CI I suspect that you're right and it has to do with some attempted compiler optimizations. Please see if raising the tolerance to 5e-12 instead of 1e-12 is sufficient for this.

sbryngelson commented 6 months ago

Since this is failing quickly on simple cases in 1D, you could find out exactly what change to the source code is ultimately causing the failure via bisecting your made changes. I think this might be useful for learning more about what is causing differences between other compilers as well.

wilfonba commented 6 months ago

Okay

wilfonba commented 6 months ago

@sbryngelson I reverted the refactoring of the volume fraction source term, which appears to fix the problem. Do you want to leave it this way, or simply acknowledge that this change breaks Intel compilers and remove the non-debug runners in the time being since no one actually uses Intel compilers? Also, I'm assuming I should revert the change to the test tolerance.

sbryngelson commented 6 months ago

@sbryngelson I reverted the refactoring of the volume fraction source term, which appears to fix the problem. Do you want to leave it this way, or simply acknowledge that this change breaks Intel compilers and remove the non-debug runners in the time being since no one actually uses Intel compilers? Also, I'm assuming I should revert the change to the test tolerance.

@wilfonba well that was quick, nice, thanks! My preference is to refactor everything out, with the secondary goal of having some idea of why this change caused a problem (perhaps giving some insight into other compiler mismatches). Can you copy here the relevant change that breaks things?

wilfonba commented 6 months ago

The relevant change can be seen in the git diff for the commit "reverse refactor of advection source term". The second "fix bug" commit was just me forgetting to change the flux_src_n_vf()... variables to flux_src_n(id)%vf()... after moving the refactored code back s_compute_rhs(). It was relatively easy to find because the failing test case was the first riemann_solver == 1 case, which is only used in the refactored code for the advection source. It's odd that the simplest abstraction caused the problem.

sbryngelson commented 4 months ago

fixed by https://github.com/MFlowCode/MFC/pull/382