njoy / NJOY2016

Nuclear data processing with legacy NJOY
https://www.njoy21.io/NJOY2016

Compiler segmentation fault when compiling NJOY2016 with gcc-11 #211

Closed roberto160275 closed 2 years ago

roberto160275 commented 3 years ago

Hi, I have tried to compile NJOY21 from source as explained here (https://docs.njoy21.io/install.html) and compilation failed due to a segmentation error while doing 'make'. There are many warnings during compilation too. Below is the output from 'make'. I'd appreciate some help to work this problem out.

Greetings,

Roberto.

Vajrayana:bin roberto$ make
[ 1%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/locale.f90.o
[ 2%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/mainio.f90.o
[ 4%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/util.f90.o
[ 5%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/endf.f90.o
[ 7%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/mathm.f90.o
[ 8%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/phys.f90.o
[ 10%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acecm.f90.o
[ 11%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acedo.f90.o
/Users/roberto/NJOY21/bin/_deps/njoy-src/src/acedo.f90:30:22:

30 | real(kr)::xss(nxss) | 1 Warning: Array 'xss' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] [ 12%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acefc.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:6964:46:

6959 | do ki=1,js | 2
...... 6964 | +renormyys(ki)(xxs(ki)-xxs(ki-1)) | 1 Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:6966:57:

6959 | do ki=1,js | 2
...... 6966 | xss(ki+2js-1+nexd)+renorm(yys(ki)+yys(ki-1))& | 1 Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:6967:31:

6959 | do ki=1,js | 2
...... 6967 | *(xxs(ki)-xxs(ki-1))/2 | 1 Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:86:22:

86 | real(kr)::xss(nxss) | 1 Warning: Array 'xss' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:15054:23:

15054 | real(kr)::ee(pltumx),s0(pltumx),s1(pltumx),s2(pltumx) | 1 Warning: Array 'ee' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:15054:34:

15054 | real(kr)::ee(pltumx),s0(pltumx),s1(pltumx),s2(pltumx) | 1 Warning: Array 's0' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:15054:45:

15054 | real(kr)::ee(pltumx),s0(pltumx),s1(pltumx),s2(pltumx) | 1 Warning: Array 's1' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:15054:56:

15054 | real(kr)::ee(pltumx),s0(pltumx),s1(pltumx),s2(pltumx) | 1 Warning: Array 's2' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acefc.f90:3343:21:

3343 | real(kr)::a(namax) | 1 Warning: Array 'a' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] [ 14%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acepa.f90.o [ 15%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acepn.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/acepn.f90:26:22:

26 | real(kr)::xss(nxss) | 1 Warning: Array 'xss' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] [ 17%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/aceth.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/aceth.f90:26:22:

26 | real(kr)::xss(nxss) | 1 Warning: Array 'xss' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] [ 18%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/acer.f90.o [ 20%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/broadr.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/broadr.f90:1407:11:

1407 | em.gt.sigfig(es(is-1),ndig,-1)) go to 150 | 1 Warning: Impure function 'sigfig' at (1) might not be evaluated [-Wfunction-elimination] [ 21%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/ccccr.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/ccccr.f90:945:34:

935 | do i=1,ngps | 2
...... 945 | ispec=nint(spec(i-1)*ichid)+ispec-1 | 1 Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2) [-Wdo-subscript] [ 22%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/covr.f90.o [ 24%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/dtfr.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/dtfr.f90:290:23:

289 | do i=1,nwsmax | 2
290 | if (i.le.n3) ids(i)=0 | 1 Warning: Array reference at (1) out of bounds (500000 > 53) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/dtfr.f90:46:22:

46 | real(kr)::a(nwamax) | 1 Warning: Array 'a' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/dtfr.f90:48:24:

48 | real(kr)::sig(nwsmax) | 1 Warning: Array 'sig' at (1) is larger than limit set by '-fmax-stack-var-size=', moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider using '-frecursive', or increase the '-fmax-stack-var-size=' limit, or change the code to use an ALLOCATABLE array. [-Wsurprising] [ 25%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/groupr.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4413:25:

4413 | egn(ig)=eg24(ig) | ^ Warning: iteration 282 invokes undefined behavior [-Waggressive-loop-optimizations] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4412:12:

4412 | do ig=1,ngp | ^ note: within this loop /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4422:25:

4422 | egn(ig)=eg25(ig) | ^ Warning: iteration 296 invokes undefined behavior [-Waggressive-loop-optimizations] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4421:12:

4421 | do ig=1,ngp | ^ note: within this loop /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4431:25:

4431 | egn(ig)=eg26(ig) | ^ Warning: iteration 362 invokes undefined behavior [-Waggressive-loop-optimizations] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4430:12:

4430 | do ig=1,ngp | ^ note: within this loop /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4440:25:

4440 | egn(ig)=eg27(ig) | ^ Warning: iteration 316 invokes undefined behavior [-Waggressive-loop-optimizations] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4439:12:

4439 | do ig=1,ngp | ^ note: within this loop /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4449:25:

4449 | egn(ig)=eg28(ig) | ^ Warning: iteration 90 invokes undefined behavior [-Waggressive-loop-optimizations] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/groupr.f90:4448:12:

4448 | do ig=1,ngp | ^ note: within this loop [ 27%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/samm.f90.o /Users/roberto/NJOY21/bin/_deps/njoy-src/src/samm.f90:4518:38:

4515 | do n=1,100 | 2
...... 4518 | a(n)=-(delta(xn-1)(xn-2)*a(n-1)+& | 1 Warning: Array reference at (1) out of bounds (0 < 1) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/samm.f90:4519:37:

4515 | do n=1,100 | 2
...... 4519 | (rhoi-2eta)(delta*2)a(n-2)+(delta*3)a(n-3))/& | 1 Warning: Array reference at (1) out of bounds (-1 < 1) in loop beginning at (2) [-Wdo-subscript] /Users/roberto/NJOY21/bin/_deps/njoy-src/src/samm.f90:4519:55:

4515 | do n=1,100 | 2
...... 4519 | (rhoi-2eta)(delta2)*a(n-2)+(delta3)a(n-3))/& | 1 Warning: Array reference at (1) out of bounds (-2 < 1) in loop beginning at (2) [-Wdo-subscript]
[ 28%] Building Fortran object _deps/njoy-build/CMakeFiles/njoy.dir/src/errorr.f90.o
f951: internal compiler error: Segmentation fault: 11
Please submit a full bug report, with preprocessed source if appropriate.
See https://github.com/Homebrew/homebrew-core/issues for instructions.
make[2]: *** [_deps/njoy-build/CMakeFiles/njoy.dir/src/errorr.f90.o] Error 1
make[1]: *** [_deps/njoy-build/CMakeFiles/njoy.dir/all] Error 2
make: *** [all] Error 2

whaeck commented 3 years ago

I would like to ask you for some more information about your compiler version (I assume that the compiler you are using is the GNU fortran compiler). We have been compiling NJOY2016 (which is what NJOY21 uses under the hood) with gfortran 7, 8 and 9 without issues for years now. I haven't tried with 10 or 11 (although I regularly use those for C++ compilation).

I would suggest trying with gcc-9, since this is the version that we use in NJOY's CI with GitHub actions - and it seems to work consistently.

Also, instead of using NJOY21 you may want to try compiling NJOY2016 as well to see if the problem persists if you compile NJOY2016 as a standalone executable (NJOY21 compiles NJOY2016 as a library with different settings, so it might be due to that).
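For readers following along, the compiler version asked about here can be checked with a few lines of shell. This is only a minimal sketch: it assumes gfortran is on PATH and accepts the GCC driver option -dumpversion, the helper names gcc_major and is_known_bad_gfortran are made up for illustration, and the check for major version 11 reflects only what is reported in this issue.

```shell
#!/bin/sh
# Print the major version from a GCC-style version string such as "11.1.0".
# A bare major version such as "11" is returned unchanged.
gcc_major() {
  printf '%s\n' "${1%%.*}"
}

# Succeed (exit status 0) when the major version is 11, the release this
# issue reports as crashing on errorr.f90. Hypothetical helper, not part of NJOY.
is_known_bad_gfortran() {
  [ "$(gcc_major "$1")" = "11" ]
}

# Inspect the gfortran found first on PATH, if there is one.
if command -v gfortran >/dev/null 2>&1; then
  version=$(gfortran -dumpversion)
  if is_known_bad_gfortran "$version"; then
    echo "gfortran $version: this issue reports a compiler crash on errorr.f90"
  else
    echo "gfortran $version"
  fi
fi
```

Note that this looks at gfortran rather than gcc, since the NJOY2016 sources are Fortran.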

roberto160275 commented 3 years ago

Hi Wim,

thank you for your reply. I think I’m using the gfortran 11 version as you can see in the attached file. I’ll try to install gcc 9 and see if that works (I have to find out how to uninstall 11 and install 9). I have tried installing NJOY2016 and I got the same segmentation fault error. Do you have any idea why I get so many warnings during compilation?

Greetings,

Roberto.

whaeck commented 3 years ago

I just tried using gcc-11 to compile NJOY2016, and I'm seeing the segfault you're describing here. As a result, I'm moving this issue to NJOY2016 itself.

The warnings are normal, since we've switched on the compiler warning flags. If you compiled NJOY2012 or NJOY99 with these flags switched on, you'd see the same warnings.

roberto160275 commented 3 years ago

I have installed gcc 7, 8 and 9, and pointed the /usr/local/bin/gcc link to /usr/local/bin/gcc-7 (or gcc-8, or gcc-9) in turn; each gives the same compilation error. I should add that this happens with both NJOY2016 and NJOY21. Frankly, I'm at a loss here. I'm still investigating what the problem might be, but I'm running out of ideas.

One odd thing I found is that, when running the command

cmake -D CMAKE_BUILD_TYPE=Release ..

I get the following output

Vajrayana:bin roberto$ cmake -D CMAKE_BUILD_TYPE=Release ..
-- Found Python3: /usr/bin/python3 (found suitable version "3.8.2", minimum required is "3.5") found components: Interpreter
-- The Fortran compiler identification is GNU 11.1.0
-- Checking whether Fortran compiler has -isysroot
-- Checking whether Fortran compiler has -isysroot - yes
-- Checking whether Fortran compiler supports OSX deployment target flag
-- Checking whether Fortran compiler supports OSX deployment target flag - yes
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Check for working Fortran compiler: /usr/local/bin/gfortran - skipped
-- Checking whether /usr/local/bin/gfortran supports Fortran 90
-- Checking whether /usr/local/bin/gfortran supports Fortran 90 - yes
-- Found Git: /usr/local/bin/git (found version "2.32.0")
--
--
-- njoy
-- Git current branch: master
-- Git commit hash: 6ef2a1d04ba3cad3c114abaaf6069e317364158d
--
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/roberto/NJOY2016/bin

I'm surprised that CMake still finds the GNU 11 Fortran compiler even though I'm pointing to gcc-8.
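The output above shows CMake using /usr/local/bin/gfortran, which changing the gcc link does not affect. One way to confirm which Fortran compiler a build directory is using is to read it back from CMakeCache.txt. The following is a rough sketch only; the function name cached_fortran_compiler is hypothetical, and it assumes the usual NAME:TYPE=VALUE layout of cache entries.

```shell
#!/bin/sh
# Print the Fortran compiler recorded in a CMake build directory's cache.
# Usage: cached_fortran_compiler <build_dir>
cached_fortran_compiler() {
  cache="$1/CMakeCache.txt"
  if [ ! -f "$cache" ]; then
    echo "no CMakeCache.txt in $1" >&2
    return 1
  fi
  # Cache entries look like NAME:TYPE=VALUE; print what follows the first '='.
  # The ':' right after the name keeps entries such as
  # CMAKE_Fortran_COMPILER_AR from matching.
  sed -n 's/^CMAKE_Fortran_COMPILER:[^=]*=//p' "$cache"
}
```

Running cached_fortran_compiler . from inside the build directory (bin in this thread) prints the compiler path that make will use.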

roberto160275 commented 3 years ago

I have solved my problem. The problem was that I was using gfortran 11 to compile. Linking to gfortran-8 allowed me to compile both versions of NJOY without problems.

Thank you for your help.

Greetings,

Roberto.

whaeck commented 3 years ago

CMake selects the compiler automatically, so it continued to select gfortran-11. You probably figured out you could use the following to select the appropriate compiler:

cmake -D CMAKE_Fortran_COMPILER=gfortran-8 -D CMAKE_BUILD_TYPE=Release ..

At least we solved the immediate problem for you. I'm leaving this issue open to make sure we look into this segfault with gfortran-11. It's a really annoying problem that we will have to fix at some point.
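As a sketch of how the compiler selection above could be scripted: the helper below picks the first gfortran from a list of candidate names and passes it to CMake through CMAKE_Fortran_COMPILER. The function names pick_gfortran and configure_with, and the candidate list in the usage note, are assumptions for illustration. The -S and -B options need CMake 3.13 or newer.

```shell
#!/bin/sh
# Print the full path of the first candidate compiler found on PATH.
# Usage: pick_gfortran <name>...
pick_gfortran() {
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      command -v "$name"
      return 0
    fi
  done
  echo "none of the requested compilers were found" >&2
  return 1
}

# Configure a build directory with an explicitly chosen Fortran compiler.
# Usage: configure_with <source_dir> <build_dir> <compiler>
configure_with() {
  cmake -S "$1" -B "$2" \
        -D CMAKE_Fortran_COMPILER="$3" \
        -D CMAKE_BUILD_TYPE=Release
}
```

For example, fc=$(pick_gfortran gfortran-9 gfortran-8) && configure_with . build "$fc" would configure with gfortran-9 when it is installed and fall back to gfortran-8 otherwise. Using a fresh build directory avoids reusing a compiler that an earlier configure run stored in the cache.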

roberto160275 commented 3 years ago

What I did was to change the file that /usr/local/bin/gfortran points to:

Vajrayana:NJOY21 roberto$ ln -sf /usr/local/Cellar/gcc@8/8.5.0/bin/gfortran-8 /usr/local/bin/gfortran

Maybe it was not the best way to handle it. Thank you for the new bit of info. I didn't know I could do that.

Greetings,

Roberto.

whaeck commented 3 years ago

While it is not "wrong" to do it that way, using -D CMAKE_Fortran_COMPILER=gfortran-8 is the safer way.

Glad to have been of assistance ;-)

roberto160275 commented 3 years ago

Oh, ok. Thank you very much. :-)

XuShuqi7 commented 3 years ago

I ran into this problem too. I really appreciate finding the answer here, namely to switch to gfortran-8. Thanks a lot!

jchsublet commented 2 years ago

@whaeck here I am making the assumption that errorr.f90 is the same in NJOY21 and NJOY2016 and that it is compiled directly with a Fortran compiler. The "f951: internal compiler error: Segmentation fault: 11" also occurs on macOS 12 Monterey using gfortran 11.2.0 with NJOY2016.65.

The segmentation fault occurs regardless of the FFlags settings in cmake or make using the above compiler version, however

Makeliste_gfortran-11.2.0.txt Makeliste_ifort-2021.5.0.txt

It may be worth reporting the issue to https://gcc.gnu.org/bugs/, as it is unlikely to go away on its own; GCC version increments are there for a reason. Using 8 or 9 when 11 and 12 have been released is not sustainable.

whaeck commented 2 years ago

Yes, ERRORR in NJOY21 is the module from NJOY2016 so this suffers from the same issue with gcc-11.

I've recently installed gcc-11 on the foreign travel laptop that I am currently using and it also fails to compile NJOY2016, I'm getting the internal segfault as well.

I was planning on reporting this to gcc at some point so now is as good as any other time.

whaeck commented 2 years ago

Some more details on what is going on here.

When we remove the ERRORR source code from NJOY2016, gcc-11 can compile the code without issues so the problem is internal to ERRORR.

I have been trying to find where the internal compiler error occurs and I have been somewhat successful, although the reason why it happens continues to elude me.

The compiler seems to walk through the source code and goes into the source code for each subroutine when it gets called (i.e. when we encounter a "call ..." statement). I have been following the code flow through the following calls:

Commenting out all call ... statements (call contio, call moreio, call error, call mess, etc.) in subroutine resprx and commenting out call rpxsamm, call rpxlc0, call rpxlc12 in the rpxunr subroutine makes the internal compiler error go away. Adding ANY call statement anywhere in subroutine resprx (even adding a simple call mess(...)) makes the internal compiler error appear again.

Adding any number of call mess in the resprx subroutine on the other hand does not cause the error to appear. I'm thinking that the issue might be due to call depth or inlining? My next step is to see if I can replicate this using a very simple program with multiple subroutines that call other subroutines.

whaeck commented 2 years ago

We are not supporting gcc-11. The release notes are updated to explain this.

Closing this as a result.

YaqiWang commented 1 year ago

@whaeck were you able to replicate this using a very simple program? I am hitting this error with GNU Fortran (GCC) 11.0.1 20210403. It will be quite painful for me to switch compilers. I am curious whether we can get this fixed in GNU Fortran or find a workaround on the NJOY side to bypass this compilation error. Thanks!

whaeck commented 1 year ago

I have genuinely tried to reproduce this internal compiler error with a simplified program but I have not succeeded. The best solution is still to either downgrade to gcc-10 or upgrade to gcc-12.

That being said, if you do NOT need to process covariances (i.e. you do not use the ERRORR module), you can work around the issue by excluding ERRORR from the compilation entirely (the issue is entirely due to the compilation of errorr.f90). You will have to modify the CMakeLists.txt file to remove errorr.f90 from the list of sources, and remove it from the main driver in main.f90 (remove the 'use errorm' line and remove the case for errorr in the select case statement). That will compile NJOY without the ERRORR module and it should compile normally.
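Before making the manual edits described above, it may help to list every place that mentions ERRORR. The sketch below only locates candidate lines and changes nothing; the function name find_errorr_refs is hypothetical, and the file paths in the usage note are assumptions about the source tree layout, so adjust them to your checkout.

```shell
#!/bin/sh
# List lines that mention errorr (the source file or the case label) or
# errorm (the module name), with file name and line number.
# Usage: find_errorr_refs <file>...
find_errorr_refs() {
  # The extra /dev/null argument makes grep print the file name
  # even when only one real file is given.
  grep -n -i -E 'errorr|errorm' "$@" /dev/null
}
```

For example, find_errorr_refs CMakeLists.txt src/main.f90 prints lines of the form file:line:text. Each hit then needs to be reviewed by hand, because the select case branch for errorr spans more than the single matching line.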

YaqiWang commented 1 year ago

Thank you so much. Both are valid solutions. I will give gcc-12 a try.