su2code / SU2

SU2: An Open-Source Suite for Multiphysics Simulation and Design
https://su2code.github.io

Non-deterministic convergence history using NEWTON-KRYLOV with OMP #2259

Closed tbellosta closed 1 month ago

tbellosta commented 6 months ago

The convergence history is non-deterministic when running the RANS solver with the SA model using OMP-only parallelism. With a fixed minimum RMS residual as the stopping criterion, the computation stops at a different iteration each time I run the solver.

That does not happen with the EULER or NAVIER_STOKES solvers.

The convergence history is non-deterministic when using the NEWTON_KRYLOV solver with OMP-only parallelism. With a fixed minimum RMS residual as the stopping criterion, the computation stops at a different iteration each time I run the solver. That happens with the EULER, NAVIER_STOKES, and RANS solvers.

I am using a common setup for each case, i.e. ROE+MUSCL convective fluxes, WLSQ for viscous gradients, and LSQ for MUSCL gradients. I also tried switching the CFL adaptation and limiters on and off, different linear solvers, and different precision (single/double) for the linear solver, but that doesn't change the outcome of the test (the RANS convergence history remains non-deterministic, while EULER and NS always have the same convergence history).

I compiled in release (-O3), with OMP support and exported the mtune/march variables to enable SIMD types.

I'll be looking in the SA/Scalar solver for some uninitialized vars. Any other ideas?

I'll be looking in the CNewtonIntegration files for some uninitialized vars or thread-safety issues. Any other ideas?

tbellosta commented 6 months ago

SCRATCH THAT

The convergence history is non-deterministic when using the NEWTON-KRYLOV solver, regardless of the equation set. With the standard linear solver, the RANS, EULER, and NS solvers always converge in the same number of iterations. When switching on the NEWTON-KRYLOV option, the solver converges in a different number of iterations each time I run it.

pcarruscag commented 6 months ago

Certain parts of the code use atomic reductions and dynamic work sharing for load balancing. That's why there is slight (hopefully) non-determinism. There are more operations per iteration when using the NK solver, which makes it more noticeable.
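To illustrate why atomic reductions and dynamic scheduling can change the result from run to run, here is a minimal sketch in plain C++. It is not SU2 code: `sumInOrder`, `contributions`, `runA`, and `runB` are hypothetical names, and the index lists stand in for the order in which threads happen to reach an atomic update. The point is only that floating-point addition is not associative, so the same contributions summed in a different order can round differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SU2): accumulate `values` in the sequence
// given by `order`, mimicking the arrival order of threads at an atomic update.
double sumInOrder(const std::vector<double>& values,
                  const std::vector<std::size_t>& order) {
  assert(order.size() == values.size());
  double sum = 0.0;
  for (std::size_t idx : order) {
    sum += values[idx];  // every addition rounds to the nearest double
  }
  return sum;
}

// Three contributions whose exact sum is 1.0.
const std::vector<double> contributions = {1.0e20, 1.0, -1.0e20};

// Two possible arrival orders of the same three contributions.
// runA: (1e20 + 1) - 1e20 -> the 1 is absorbed by 1e20, result is 0.0
const double runA = sumInOrder(contributions, {0, 1, 2});
// runB: (1e20 - 1e20) + 1 -> the large terms cancel first, result is 1.0
const double runB = sumInOrder(contributions, {0, 2, 1});
```

The magnitudes here are exaggerated on purpose. In a real run the differences would presumably sit in the last bits of a residual or dot product, but that can be enough to cross a fixed residual threshold one iteration earlier or later.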