...
compflow: Max relative pressure change during time step: 4.598 %
compflow: Max absolute phase volume fraction change during time step: 0.052
compflow: Time-step required will be increased based on state change.
Time: 2.58e+04 s, dt: 8163.258750212592 s, Cycle: 4
Attempt: 0, ConfigurationIter: 0, NewtonIter: 0
( Rflow ) = ( 5.96e-02 ) ( R ) = ( 5.96e-02 )
MGR preconditioner: numComponentsPerField = [3]
Linear Solver | Success | Iterations: 22 | Final Rel Res: 0.00064163 | Make Restrictor Time: 0 | Compute Auu Time: 0 | SC Filter Time: 0 | Setup Time: 45.5268 s | Solve Time: 217.744 s
compflow: Max pressure change: 1575956.141 Pa (before scaling)
compflow: Max component density change: 45.239 kg/m3 (before scaling)
compflow: Global solution scaling factor = 1
compflow: Max deltaPhaseVolFrac = 0.045105931221652684
Attempt: 0, ConfigurationIter: 0, NewtonIter: 1
( Rflow ) = ( 1.57e-02 ) ( R ) = ( 1.57e-02 )
Last LinSolve(iter,res) = ( 22, 6.42e-04 )
MGR preconditioner: numComponentsPerField = [3]
Linear Solver | Success | Iterations: 16 | Final Rel Res: 0.000614216 | Make Restrictor Time: 0 | Compute Auu Time: 0 | SC Filter Time: 0 | Setup Time: 45.7587 s | Solve Time: 152.552 s
compflow: Max pressure change: 119510.797 Pa (before scaling)
compflow: Max component density change: 2.287 kg/m3 (before scaling)
compflow: Global solution scaling factor = 1
compflow: Max deltaPhaseVolFrac = 0.0022841038896992405
Attempt: 0, ConfigurationIter: 0, NewtonIter: 2
( Rflow ) = ( 7.33e-05 ) ( R ) = ( 7.33e-05 )
Last LinSolve(iter,res) = ( 16, 6.14e-04 )
compflow: Max relative pressure change during time step: 4.554 %
compflow: Max absolute phase volume fraction change during time step: 0.047
compflow: Time-step required will be increased based on state change.
Time: 3.39e+04 s, dt: 11077.291615478814 s, Cycle: 5
Attempt: 0, ConfigurationIter: 0, NewtonIter: 0
( Rflow ) = ( 5.63e-02 ) ( R ) = ( 5.63e-02 )
MGR preconditioner: numComponentsPerField = [3]
Linear Solver | Success | Iterations: 27 | Final Rel Res: 0.000747275 | Make Restrictor Time: 0 | Compute Auu Time: 0 | SC Filter Time: 0 | Setup Time: 45.8097 s | Solve Time: 275.458 s
compflow: Max pressure change: 1567308.382 Pa (before scaling)
compflow: Max component density change: 47.543 kg/m3 (before scaling)
compflow: Global solution scaling factor = 1
compflow: Max deltaPhaseVolFrac = 0.04744675929757902
Attempt: 0, ConfigurationIter: 0, NewtonIter: 1
( Rflow ) = ( 1.75e-02 ) ( R ) = ( 1.75e-02 )
Last LinSolve(iter,res) = ( 27, 7.47e-04 )
MGR preconditioner: numComponentsPerField = [3]
Linear Solver | Success | Iterations: 17 | Final Rel Res: 0.00090827 | Make Restrictor Time: 0 | Compute Auu Time: 0 | SC Filter Time: 0 | Setup Time: 45.9779 s | Solve Time: 162.657 s
compflow: Max pressure change: 143952.444 Pa (before scaling)
compflow: Max component density change: 2.073 kg/m3 (before scaling)
compflow: Global solution scaling factor = 1
compflow: Max deltaPhaseVolFrac = 0.0020719133774779186
Attempt: 0, ConfigurationIter: 0, NewtonIter: 2
( Rflow ) = ( 6.98e-05 ) ( R ) = ( 6.98e-05 )
Last LinSolve(iter,res) = ( 17, 9.08e-04 )
compflow: Max relative pressure change during time step: 4.310 %
compflow: Max absolute phase volume fraction change during time step: 0.047
compflow: Time-step required will be increased based on state change.
Time: 4.50e+04 s, dt: 15052.402670751115 s, Cycle: 6
Attempt: 0, ConfigurationIter: 0, NewtonIter: 0
( Rflow ) = ( 6.06e-02 ) ( R ) = ( 6.06e-02 )
MGR preconditioner: numComponentsPerField = [3]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aab053e4700 (LWP 74345)]
hypre_SeqVectorSetConstantValuesHost._omp_fn.0 () at vector.c:340
340 vector.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install blas-3.4.2-8.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-326.el7_9.x86_64 lapack-3.4.2-8.el7.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-11.el7.x86_64 libgfortran-4.8.5-44.el7.x86_64 libibverbs-58mlnx43-1.58203.x86_64 libnl3-3.2.28-4.el7.x86_64 librdmacm-58mlnx43-1.58203.x86_64 libxpmem-2.6.4-1.58203.rhel7u9.x86_64 numactl-libs-2.0.12-5.el7.x86_64 sssd-client-1.16.5-10.el7_9.15.x86_64 systemd-libs-219-78.el7_9.7.x86_64 xz-libs-5.2.2-2.el7_9.x86_64
(gdb) where
#0 hypre_SeqVectorSetConstantValuesHost._omp_fn.0 () at vector.c:340
#1 0x00002aaab6ec8a86 in gomp_thread_start (xdata=<optimized out>) at ../../../libgomp/team.c:123
#2 0x00002aaab617dea5 in start_thread () from /lib64/libpthread.so.0
#3 0x00002aaab7403b0d in clone () from /lib64/libc.so.6
In a debugger we can see:
(gdb) directory /home/mtml/src/GEOS/thirdPartyLibs/
.git/ .gitattributes .github/ .gitignore .gitmodules CMakeLists.txt cmake/ docker/ host-configs/ scripts/ tpl.cpp tplMirror/
(gdb) !which geosx
/data/saet/mtml/software/x86_64/RHEL7/GEOS/0.2.0/install-CPU-OPTO1-Hypre-GCC_10.2.0-ompi_hpcx-OMP-relwithdebinfo/bin/geosx
(gdb) list
335 in vector.c
(gdb) info threads
Id Target Id Frame
* 8 Thread 0x2aab053e4700 (LWP 74345) "geosx" hypre_SeqVectorSetConstantValuesHost._omp_fn.0 () at vector.c:340
7 Thread 0x2aaae19d0700 (LWP 74344) "geosx" hypre_SeqVectorSetConstantValuesHost._omp_fn.0 () at vector.c:340
6 Thread 0x2aaae095d700 (LWP 74343) "geosx" futex_wait (val=455624, addr=0xfab934) at ../../../libgomp/config/linux/x86/futex.h:44
5 Thread 0x2aaad7348700 (LWP 74321) "async" 0x00002aaab74040e3 in epoll_wait () from /lib64/libc.so.6
4 Thread 0x2aaad57eb700 (LWP 74318) "fuse" 0x00002aaab618475d in read () from /lib64/libpthread.so.0
3 Thread 0x2aaaca54a700 (LWP 74307) "geosx" 0x00002aaab74040e3 in epoll_wait () from /lib64/libc.so.6
2 Thread 0x2aaac7534700 (LWP 74277) "geosx" 0x00002aaab74040e3 in epoll_wait () from /lib64/libc.so.6
1 Thread 0x2aaaaab44ec0 (LWP 74014) "geosx" futex_wait (val=455624, addr=0xfab934) at ../../../libgomp/config/linux/x86/futex.h:44
(gdb) print
`
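The `info threads` table above can be tallied by frame to see at a glance how many threads are sitting in the faulting function. The following is a minimal Python sketch that assumes the row layout shown in this session; `frames_by_function` and both regular expressions are hypothetical helpers written for illustration, not part of gdb, hypre, or GEOS.

```python
import re
from collections import Counter

# One row of gdb's `info threads` table, for example:
#   * 8 Thread 0x2aab053e4700 (LWP 74345) "geosx" hypre_... () at vector.c:340
# Groups: thread id, LWP id, thread name, remainder of the line (the frame).
ROW = re.compile(
    r'^\*?\s*(\d+)\s+Thread\s+0x[0-9a-f]+\s+\(LWP\s+(\d+)\)\s+"([^"]+)"\s+(.*)$'
)

# The frame text may start with "0x<address> in " before the function name.
FRAME = re.compile(r'^(?:0x[0-9a-f]+\s+in\s+)?([\w.:~]+)')


def frames_by_function(info_threads_text):
    """Count threads per innermost function name in `info threads` output."""
    counts = Counter()
    for line in info_threads_text.splitlines():
        row = ROW.match(line.strip())
        if row is None:
            continue  # header line or unrelated gdb output
        frame = FRAME.match(row.group(4))
        if frame is not None:
            counts[frame.group(1)] += 1
    return counts
```

Applied to the eight rows above, this would report two threads in `hypre_SeqVectorSetConstantValuesHost._omp_fn.0`, two in `futex_wait`, three in `epoll_wait`, and one in `read`.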
When setting a breakpoint at the function that fails, we get:
`Breakpoint 1, hypre_SeqVectorSetConstantValuesHost (v=0x1099950, value=value@entry=0) at vector.c:328
328 vector.c: No such file or directory.
(gdb) where
#0 hypre_SeqVectorSetConstantValuesHost (v=0x1099950, value=value@entry=0) at vector.c:328
#1 0x00002aaaae97e4f8 in hypre_SeqVectorSetConstantValues (v=<optimized out>, value=value@entry=0) at vector.c:378
#2 0x00002aaaae966c7e in hypre_ParVectorSetConstantValues (v=<optimized out>, value=value@entry=0) at par_vector.c:327
#3 0x00002aaaac8e3613 in geos::HypreVector::create (this=this@entry=0xe61520, localSize=<optimized out>, comm=<optimized out>) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/linearAlgebra/interfaces/hypre/HypreVector.cpp:115
#4 0x00002aaaacb3a844 in geos::SolverBase::setupSystem (this=0xe612b0, domain=..., dofManager=..., localMatrix=..., rhs=..., solution=..., setSparsity=true) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/physicsSolvers/SolverBase.cpp:1078
#5 0x00002aaaacb3566f in geos::SolverBase::solverStep (this=0xe612b0, time_n=@0x7fffffff4998: 0, dt=@0x7fffffff4988: 10000, cycleNumber=0, domain=...) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/physicsSolvers/SolverBase.cpp:218
#6 0x00002aaaacb36bcd in geos::SolverBase::execute (this=0xe612b0, time_n=0, dt=10000, cycleNumber=0, domain=...) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/physicsSolvers/SolverBase.cpp:251
#7 0x00002aaaae7fd870 in geos::EventBase::execute (this=0xe48200, time_n=0, dt=10000, cycleNumber=0, domain=...) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/events/EventBase.cpp:233
#8 0x00002aaaae801dc2 in geos::EventManager::run (this=this@entry=0xe36510, domain=...) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/events/EventManager.cpp:193
#9 0x00002aaaae814692 in geos::ProblemManager::runSimulation (this=<optimized out>) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/mainInterface/ProblemManager.cpp:1081
#10 0x00002aaaae811223 in geos::GeosxState::run (this=this@entry=0x7fffffff52a0) at /dev/shm/mtml/src/GEOS/GEOS/src/coreComponents/mainInterface/GeosxState.cpp:177
#11 0x000000000040b65f in main (argc=<optimized out>, argv=0x7fffffff55a8) at /dev/shm/mtml/src/GEOS/GEOS/src/main/main.cpp:46
Describe the bug OpenMP-enabled GEOS terminates with a SIGSEGV while simulating the model "./SPE10_refined.xml" from [https://github.com/GEOS-DEV/MAELSTROM/tree/master/usecases/francois/SPE10/flow] when more than one OpenMP thread is active. The crash occurs in hypre_SeqVectorSetConstantValuesHost (vector.c:340), inside an OpenMP worker thread.
To Reproduce Steps to reproduce the behavior:
OMP_NUM_THREADS=20 mpirun -np 1 $(which geosx) -i ./SPE10_refined.xml \
    -t runtime-report,max_column_width=200,calc.inclusive,mpi-report -x 1 -y 1 -z 1
A minimal test case is simply running the "SPE10_refined.xml" model from [https://github.com/GEOS-DEV/MAELSTROM/tree/master/usecases/francois/SPE10/flow].
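To confirm that the failure depends on the thread count, the same single-rank run can be repeated with different `OMP_NUM_THREADS` values. The following is a minimal Python sketch, assuming `mpirun` and `geosx` are on the `PATH` and that the working directory contains the model file; `build_run` and `sweep` are hypothetical helper names, and the flags are copied from the reproduction command.

```python
import os
import subprocess

# Arguments copied from the reproduction command in this report.
GEOSX_ARGS = [
    "-i", "./SPE10_refined.xml",
    "-t", "runtime-report,max_column_width=200,calc.inclusive,mpi-report",
    "-x", "1", "-y", "1", "-z", "1",
]


def build_run(threads, geosx="geosx"):
    """Return (argv, env) for one single-rank run with `threads` OpenMP threads."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    argv = ["mpirun", "-np", "1", geosx] + GEOSX_ARGS
    # Copy the current environment and override only the thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    return argv, env


def sweep(thread_counts, geosx="geosx"):
    """Run the model once per thread count and map each count to the exit code."""
    results = {}
    for n in thread_counts:
        argv, env = build_run(n, geosx)
        results[n] = subprocess.run(argv, env=env).returncode
    return results
```

Calling `sweep([1, 2, 20])` on the affected machine should show which thread counts exit with a non-zero code. Each run is a full simulation, so the sweep can take a long time.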
Expected behavior The model should run to completion regardless of the number of OpenMP threads.
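Platform RHEL 7 (x86_64); GEOS 0.2.0 built with GCC 10.2.0, HPC-X Open MPI, Hypre, and OpenMP in RelWithDebInfo mode, as indicated by the install path shown above.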