valassi closed this 7 months ago
This is fixed in PR #802.
I ended up using FC to link all Fortran/C++ code together in cudacpp.mk. (I did not touch madevent, and I am not even sure what we are doing there for linking Fortran/C++/HIP.) https://github.com/madgraph5/madgraph4gpu/pull/801/commits/5c27ed64ed7bd9ed37e439aac23284082c06e759
Note: in the end I went back to hip, gcc and gfortran. I had tried hip, clang and flang, which works in standalone (SA) cudacpp but fails for madevent, where flang gives zillions of F90 errors on the madevent files (#804). To use gfortran, I also had to add -lpthread explicitly: https://github.com/madgraph5/madgraph4gpu/pull/801/commits/2fc0d87823bdbfb461899e1454c3ea8a0b90490b
This can be closed.
I am doing some tests on the LUMI AMD GPUs for PR #801.
The gcheck.exe standard test seems ok.
However, fgcheck.exe segfaults.
Also, gdb does not help.
I have done some poor man's debugging by disabling parts of fcheck_sa.f. It turns out that the error is in very simple code: already the READ statements fail.
The above is with gfortran as the Fortran compiler (FC) and with the default cudacpp.mk, where the link (of HIP, Fortran and C++) is done using hipcc. (For comparison, the same setup with nvcc works fine for CUDA in my environments.)
The only thing that I was able to get to work in this LUMI environment involves two changes: first, use flang (hidden inside the ROCm installation) instead of gfortran for FC; second, use that same flang instead of hipcc to link fgcheck.exe, adding however
-lstdc++ -L /opt/rocm-5.2.3/lib/ -lamdhip64
to the link command.
This is a problem I have observed only for fgcheck.exe so far, but I guess that I would get the same for madevent? Maybe not, because it seems that we are actually already linking madevent with the Fortran compiler (which is what would work here with flang). So I guess that we should probably always link Fortran, C++ and GPU code with the Fortran compiler? I will do more checks tomorrow.
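For reference, the flang-based link step described above could look roughly like the sketch below. This is only an illustration, not the actual rule in cudacpp.mk: the object file names are hypothetical placeholders, and it assumes that the flang from the ROCm installation is on the PATH. The script only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: the object file names are hypothetical placeholders.
FC=flang                          # assumption: the flang from the ROCm installation is on PATH
ROCM_LIB=/opt/rocm-5.2.3/lib/     # ROCm library directory quoted in this issue
OBJS="fcheck_sa.o fbridge.o"      # hypothetical object files, for illustration only

# Link with the Fortran compiler instead of hipcc, adding the C++ and HIP
# runtime libraries explicitly.
LINK_CMD="$FC -o fgcheck.exe $OBJS -lstdc++ -L $ROCM_LIB -lamdhip64"

# Print the command rather than executing it, so the sketch is safe to run anywhere.
echo "$LINK_CMD"
```

The explicit -lstdc++ and -lamdhip64 are needed because a Fortran compiler driver, unlike hipcc, does not add the C++ and HIP runtimes to the link on its own.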
By the way, the workaround above does seem to need flang: if I tested this correctly, linking with gfortran would not work. Maybe this would be easier with some nicer compiler combinations. I am not sure whether @Jooorgen has observed anything like this?
I keep the details here for reference.