Closed andrewsilver1997 closed 3 months ago
okay, just to be clear, this is in MAESTROeX and not MAESTRO, correct?
It does appear that we are missing some OMP reductions in the diag code.
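For anyone following along, here is a minimal sketch of what a missing OMP reduction means in practice. It is not the MAESTROeX diag code: `DiagResult` and `compute_diag` are hypothetical names made up for illustration. Without the `reduction` clauses, every thread would update the shared `sum`, `vmax`, and `count` at the same time, and the results could change from run to run.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a diagnostic result; not a MAESTROeX type.
struct DiagResult {
    double sum;    // a summed quantity
    double vmax;   // a peak value (this sketch assumes non-negative data)
    int    count;  // number of entries above a threshold
};

// Hypothetical diagnostic loop. The reduction clauses give each thread a
// private copy of the accumulators and combine the copies when the loop ends.
DiagResult compute_diag(const std::vector<double>& data, double threshold) {
    double sum = 0.0;
    double vmax = 0.0;
    int count = 0;
    const long n = static_cast<long>(data.size());

#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum, count) reduction(max:vmax)
#endif
    for (long i = 0; i < n; ++i) {
        sum += data[i];
        vmax = std::max(vmax, data[i]);
        if (data[i] > threshold) {
            ++count;
        }
    }
    return {sum, vmax, count};
}
```

Built without OpenMP, the pragma is skipped and the loop runs serially, so the function can be compared against a threaded build. Note that a floating-point sum can differ in the last bits between the two builds because the order of additions changes.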
yes, it's MAESTROeX. So it's unfixable at this point?
we can fix it. Give me a bit. There seem to be a few issues.
we haven't really been running with OpenMP much, so it is not tested as well as MPI + CUDA / GPUs.
in the meantime, you can just remove the OpenMP pragma before the MFIter loop in MaestroDiag.cpp.
note: you should not run with GPUs and OpenMP.
GPU support is via CUDA, so you would compile with OpenMP disabled.
PR #455 should fix MPI + OpenMP if you can test it out.
I believe that this is fixed. Reopen if you still have issues.
Hi, now I'm running with OMP only, but I get a segfault. This is the backtrace.0.0 file:
Host Name: node104
=== If no file names and line numbers are shown below, one can run
addr2line -Cpfie my_exefile my_line_address
to convert my_line_address
(e.g., 0x4a6b) into file name and line number.
Or one can use amrex/Tools/Backtrace/parse_bt.py.
=== Please note that the line number reported by addr2line may not be accurate. One can use readelf -wl my_exefile | grep my_line_address' to find out the offset for that line.
0: ./Maestro3d.gnu.OMP.ex() [0x73da50] amrex::BLBackTrace::print_backtrace_info(_IO_FILE*) /scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/wdconvect/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:200:36
1: ./Maestro3d.gnu.OMP.ex() [0x7438df] amrex::BLBackTrace::handler(int) /scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/wdconvect/../../../external/amrex/Src/Base/AMReX_BLBackTrace.cpp:100:15
2: /lib64/libc.so.6(+0x4e5b0) [0x7fb2493835b0]
3: ./Maestro3d.gnu.OMP.ex() [0x4c3b07]
std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator
4: ./Maestro3d.gnu.OMP.ex() [0x4c06b9]
std::__cxx11::basic_string<char, std::char_traits
5: ./Maestro3d.gnu.OMP.ex() [0x424d0d] main /scratch/p310347/DVR-time-prediction/data/MAESTROeX/Exec/science/wdconvect/../../../Source/main.cpp:63:52
6: /lib64/libc.so.6(__libc_start_main+0xe5) [0x7fb24936f7e5] __libc_start_main /usr/src/debug/glibc-2.28-251.el8.2.x86_64/csu/../csu/libc-start.c:336:3
7: ./Maestro3d.gnu.OMP.ex() [0x434cfe] _start at ??:?
can you share the job_info file that is output in any of the plotfile directories? This is failing at:
    if (std::filesystem::exists("plot_and_continue")) {
        remove("plot_and_continue");
        do_plotfile = true;
    }
which is odd.
it started to work after I updated the code to the latest version. thank you:)
I'm running the wdconvect problem with only OMP enabled. The inputs file is inputs_3d_C.128. But the simulation seems very unstable and occasionally stops with the following message: amrex::Abort::0::ERROR: ncenter invalid in Diag() !!!
And when I run the simulation with GPU and OMP, I get a segmentation fault.