Closed kweide closed 5 years ago
Stack trace from gdb for additional information:
klaus@asterix:/home/aladin/klaus/projects/amrex/Tutorials/Amr/Advection_F/Exec/SingleVortex$ OMP_NUM_THREADS=1 gdb ./main2d.gnu.DEBUG.MPI3.OMP.ex
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./main2d.gnu.DEBUG.MPI3.OMP.ex...done.
(gdb) set args inputs
(gdb) b exit
Breakpoint 1 at 0x40ea60
(gdb) r
Starting program: /home/aladin/klaus/projects/amrex/Tutorials/Amr/Advection_F/Exec/SingleVortex/main2d.gnu.DEBUG.MPI3.OMP.ex inputs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff45ef700 (LWP 1820)]
MPI initialized with 1 MPI processes
OMP initialized with 1 OMP threads
Successfully read inputs file ...
STEP 1 starts ...
At line 461 of file ../../../../../Src/Base/AMReX_mempool_mod.F90
Fortran runtime error: Index '1' of dimension 4 of array 'a' below lower bound of 4294967295
Breakpoint 1, __GI_exit (status=2) at exit.c:104
104 exit.c: No such file or directory.
(gdb) where
#0 __GI_exit (status=2) at exit.c:104
#1 0x00007ffff73a624b in _gfortran_runtime_error_at () from /usr/lib/x86_64-linux-gnu/libgfortran.so.3
#2 0x0000000000601672 in amrex_mempool_module::bl_deallocate_r4 (a=...) at ../../../../../Src/Base/AMReX_mempool_mod.F90:461
#3 0x000000000066732b in amrex_fab_module::amrex_fab_destroy (fab=...) at ../../../../../Src/F_Interfaces/Base/AMReX_fab_mod.F90:89
#4 0x0000000000667b9e in amrex_fab_module::amrex_fab_build_alloc (fab=..., bx=..., nc=2) at ../../../../../Src/F_Interfaces/Base/AMReX_fab_mod.F90:46
#5 0x000000000066647a in amrex_fab_module::amrex_fab_resize (this=..., bx=..., nc=2) at ../../../../../Src/F_Interfaces/Base/AMReX_fab_mod.F90:126
#6 0x00000000005b5c85 in __compute_dt_module_MOD_est_timestep._omp_fn.0 () at ../../../../../Tutorials/Amr/Advection_F/Source/compute_dt_mod.F90:70
#7 0x00000000005b4d7b in compute_dt_module::est_timestep (lev=0, time=0) at ../../../../../Tutorials/Amr/Advection_F/Source/compute_dt_mod.F90:65
#8 0x00000000005b5150 in compute_dt_module::compute_dt () at ../../../../../Tutorials/Amr/Advection_F/Source/compute_dt_mod.F90:24
#9 0x00000000005aef14 in evolve_module::evolve () at ../../../../../Tutorials/Amr/Advection_F/Source/evolve_mod.F90:33
#10 0x00000000005a665a in MAIN__ () at ../../../../../Tutorials/Amr/Advection_F/Source/fmain.F90:19
(gdb)
This appears to be a gnu compiler bug in OpenMP. In Tutorials/Amr/Advection_F/Source/compute_dt_mod.F90
, function est_timestep (lev, time)
, I modified the code to be
call amrex_print(u%bx)
print *, "nc = ", u%nc
print *, "p ", associated(u%fp)
!$omp parallel reduction(min:dt_est) private(umax,bx,u,mfi,p)
call amrex_print(u%bx)
print *, "nc = ", u%nc
print *, "p ", associated(u%fp)
call amrex_mfiter_build(mfi, phi_new(lev), tiling=.true.)
do while(mfi%next())
bx = mfi%nodaltilebox()
I also modified Src/F_Interfaces/Base/AMReX_fab_mod.F90
to make fp
public.
I ran with one thread, and I got
STEP 1 starts ...
((1,1) (1,1) (0,0))
nc = 0
p F
((71081616,32766) (32592,538976288) (1,1))
nc = 0
p T
So immediately after entering the omp parallel region, the private type(amrex_fab) u
became garbage. This seems to be a compiler bug.
Thanks for looking into this!
What compiler version were you using?
gcc 5.4.0
Hmm, isn't it completely expected that u
becomes (potentially) garbage after the !$om parallel
?
After all, it is listed in a private
clause, not firstprivate
.
amrex_fab
is defined as
type(amrex_box) :: bx
integer :: nc = 0
logical, private :: owner = .false.
type(c_ptr), private :: cp = c_null_ptr
real(amrex_real), private, pointer, dimension(:,:,:,:) :: fp => null()
So I think they should be initialized.
My understanding of Fortran standard could be wrong. But if I were wrong on this, the standard were seriously flawed, in my opinion.
I added a new function to amrex_fab
. We can call u%reset_omp_private()
immediately after omp parallel
. Note that this a very dangerous function. It will cause memory leak if it is called on fab that has memory already allocated.
Could you test to see if it works?
By the way, USE_MPI3
has not been tested thoroughly. So I do not recommend using it.
Thanks, I will do some testing later.
One would think that those default values in the declaration of type(amrex_fab)
should result in a well-defined state of the the 'privatized' copies of u
, but apparently not so...
I found these words in the OpenMP 4.5 specification under 2.15.3.3 (same text in current draft for OpenMP 5.0 under 2.22.2), and I think they apply here:
If any statement of the construct references a list item, a new list item of the same type and type parameters is allocated. This allocation occurs once for each task generated by the construct and once for each SIMD lane used by the construct. The initial value of the new list item is undefined. The initial status of a private pointer is undefined.
This is a Fortran-specific section.
Maybe your (and my) expectations are shaped by what would happen in C++ with in an analogous situation. The standard documents state in C/C++ section, just a few lines above the quoted text:
The new list item is initialized, or has an undefined initial value, as if it had been locally declared without an initializer.
I think you are right.
Later in that section,
Finalization of a list item of a finalizable type or subojects of a list item of a finalizable type occurs at the end of the region.
So private variables are finalized but not initialized. The finalizer could be called on uninitialized objects. Don't understand why it is this way.
It's just funny that C++ is safer than Fortran.
Later in that section,
Finalization of a list item of a finalizable type or subojects of a list item of a finalizable type occurs at the end of the region.
So private variables are finalized but not initialized. The finalizer could be called on uninitialized objects. Don't understand why it is this way.
Well, Fortran now has final routines but not really constructors, at least not in the same sense as C++ ...
Since OpenMP says the "initial value ... is undefined", I guess it is okay for a compiler to do the expected thing and define them anyway. Perhaps many compilers do. Maybe there is even a flag to turn this on for GFortran?
It's just not okay for code to assume that the compiler implements the expected initialization behavior.
Could you test to see if it works?
Works for me now.
(Same configuration as before; tried OMP_NUM_THREAD=1 and > 1, number of MPI tasks = 1 and > 1.)
I have tried two alternative ways to avoid the "undefined after privatization" situation. Both appear to work (in Advection_F, as before), no calls to reset_omp_private() necessary.
amrex_fab
variables as firstprivate
. Changed line, in the case of evolve_mod.F90
:
!$omp parallel private(mfi,bx,tbx,pin,pout,pux,puy,puz,pfx,pfy,pfz,pf,pfab,particles) firstprivate(uface,flux)
amrex_fab
variables as threadprivate
. Changed lines:
type(amrex_fab),save :: uface(amrex_spacedim)
type(amrex_fab),save :: flux(amrex_spacedim)
!$omp threadprivate(uface,flux)
...
!$omp parallel private(mfi,bx,tbx,pin,pout,pux,puy,puz,pfx,pfy,pfz,pf,pfab,particles)
I kinda like the firstprivate() way. What do you think?
This would depends on how firstprivate variables are initialized. If all it does is memcpy, that will be fine. But if they call assignment(=)
of amrex_fab
, then we will have problems because it will try to free an uninitialized pointer. I am not sure what the OpenMP standard says.
From OpenMP 4.5, 2.15.3.4:
25 If the original list item does not have the POINTER attribute, initialization of the new list items
26 occurs as if by intrinsic assignment, unless the original list item has the allocation status of not
27 currently allocated, in which case the new list items will have the same status.
:+1:
From openmp-TR7.pdf (draft OpenMP 5.0), 2.22.4.4:
If the original list item does not have the POINTER attribute, initialization of the new list items occurs as if by intrinsic assignment unless the list item has a type bound procedure as a defined assignment. If the original list item that does not have the POINTER attribute has the allocation status of unallocated, the new list items will have the same status.
:grimacing: The draft doesn't seem to say what happens when the "list item" does have "a type bound procedure as a defined assignment", at least I didn't find it.
Runtime failure of Advection_F compiled with debugging:
Some info on the environment:
Modified GNUmakefile by hand.