firemodels / fds

Fire Dynamics Simulator
https://pages.nist.gov/fds-smv/

Different DEVC time histories from same input file #5161

Closed rmcdermo closed 7 years ago

rmcdermo commented 7 years ago

Users are reporting different results for two different runs using the attached input file. The discussion forum thread is given here.

I want to make sure this is not the result of arrays that are not initialized.

teste.fds.txt

mcgratta commented 7 years ago

I ran this case with the Intel Thread Checker and it produced no errors, or at least I did not see evidence of any. However, when I run the case with 1 MPI process and 1 OpenMP thread, I get different results than if I run the case with 1 MPI process and 2 OpenMP threads. So it does not appear to be a problem with MPI. However, I cannot yet pinpoint what is causing the problem in OpenMP.
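As an aside on why the thread count alone can change numbers: floating-point addition is not associative, so splitting a sum across threads can change the rounding even when the directives are correct. The sketch below imitates that in plain Python. The names `serial_sum` and `chunked_sum` are hypothetical, the "threads" are just list slices, and nothing here is taken from FDS. Whether this effect plays any role in the case above is not established in this thread.

```python
def serial_sum(values):
    """Add the values left to right, as a single thread would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Add each contiguous chunk separately, then combine the partial sums.

    This mimics a reduction split over n_chunks workers (illustrative only).
    """
    size = (len(values) + n_chunks - 1) // n_chunks
    partials = [serial_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return serial_sum(partials)


# Values chosen so that rounding depends on the order of the additions.
values = [1e16, 1.0, -1e16, 1.0]

if __name__ == "__main__":
    print("1 chunk :", serial_sum(values))
    print("2 chunks:", chunked_sum(values, 2))
```

A difference of this kind starts at round-off level and can grow over time in a turbulent flow calculation, so divergence by itself does not prove that a directive is wrong.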

lu-kas commented 7 years ago

Sorry for the brevity, I'm on the move.

There is a potential OpenMP bug with Intel Fortran 2017. Could you please check with Intel 2017 and GCC?

http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JURECA/UserInfo/KnownIssues.html

Intel is already looking into this.

Best, Lukas


mcgratta commented 7 years ago

I ran the case with the GNU Fortran compiler, first with 1 OpenMP thread and then with 6. The results diverge in a similar way after about 5 s. So I think we have a flaw in some OpenMP directive in FDS, but so far the Intel Thread Checker has not found it.
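For anyone who wants to pin down where two runs start to differ, here is a minimal sketch that compares two device output files column by column. It assumes the usual `_devc.csv` layout (first row units, second row column names, then numeric rows with time in the first column). The function names and tolerances are hypothetical choices, not part of FDS.

```python
import csv
import math
import sys


def load_devc(path):
    """Read a DEVC-style CSV: row 1 = units, row 2 = column names, then data."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    names = [name.strip() for name in rows[1]]
    data = [[float(x) for x in row] for row in rows[2:] if row]
    return names, data


def first_divergence(data_a, data_b, rel_tol=1e-6, abs_tol=1e-12):
    """Return (time, column index) of the first mismatch, or None if none is found.

    Column 0 is assumed to hold the time. Only rows present in both runs are compared.
    """
    for row_a, row_b in zip(data_a, data_b):
        for col in range(1, min(len(row_a), len(row_b))):
            if not math.isclose(row_a[col], row_b[col],
                                rel_tol=rel_tol, abs_tol=abs_tol):
                return row_a[0], col
    return None


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_devc.py run_a_devc.csv run_b_devc.csv
    names, run_a = load_devc(sys.argv[1])
    _, run_b = load_devc(sys.argv[2])
    hit = first_divergence(run_a, run_b)
    if hit is None:
        print("No divergence found within tolerance")
    else:
        print(f"First divergence at t = {hit[0]} s in column {names[hit[1]]}")
```

Running it on the outputs of a 1-thread run and a multi-thread run (each copied to its own file name) reports the first output time at which any device differs beyond the tolerance.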

mcgratta commented 7 years ago

More news. I ran the case with Intel Fortran 17 using -O1 optimization. The divergence in results does not appear. Using -O2, it does. The Intel Thread Checker runs with -O0, and it did not detect a problem.

Sigh, these are the worst kinds of bugs.

mcgratta commented 7 years ago

I think I found and fixed the problem; see #5192. I'll leave this issue open until the fix passes firebot and I add the case to the verification suite.