Remove '-G' from debug compiler options for HIP

valassi commented 6 months ago

Still related to PR #801 (HIP support) and to debugging the crash #806 in gqttq.

'make -f cudacpp.mk debug' fails for HIP:

[valassia@nid005377 bash] ~/GPU2023/madgraph4gpuX/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gu_ttxu > make -f cudacpp.mk debug
cudacpp.mk:135: CUDA builds are not supported for multi-word CXX "CC --cray-bypass-pkgconfig -craype-verbose"
cudacpp.mk:148: HIP_HOME was not set: using "/opt/rocm"
cudacpp.mk:330: Using AVX='avx2' because host does not support avx512vl
OMPFLAGS=
AVX=avx2
FPTYPE=d
HELINL=0
HRDCOD=0
RNDGEN=hasNoCurand
Building in BUILDDIR=. for tag=avx2_d_inl0_hrd0_hasNoCurand (USEBUILDDIR is not set)
ccache /opt/rocm/bin/hipcc --shared -o ../../lib/libmg5amc_gu_ttxu_cuda.so ./gCPPProcess.o ./gMatrixElementKernels.o ./gBridgeKernels.o ./gCrossSectionKernels.o ./fbridge_cu.o -Xlinker -rpath='$ORIGIN' -L../../lib -lmg5amc_common
Traceback (most recent call last):
  File "/opt/rocm/bin/rocm_agent_enumerator", line 259, in <module>
    main()
  File "/opt/rocm/bin/rocm_agent_enumerator", line 243, in main
    target_list = readFromKFD()
  File "/opt/rocm/bin/rocm_agent_enumerator", line 202, in readFromKFD
    line = f.readline()
PermissionError: [Errno 1] Operation not permitted
ccache /opt/rocm/bin/hipcc  -g -O0 -G -I. -I../../src -I/opt/rocm/include/ -target x86_64-linux-gnu --offload-arch=gfx90a -DHIP_FAST_MATH -DHIP_PLATFORM=amd -fPIC -std=c++17 -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -fPIC -c gRamboSamplingKernels.cu -o gRamboSamplingKernels.o
clang-14: warning: argument unused during compilation: '-G -I.' [-Wunused-command-line-argument]
In file included from gRamboSamplingKernels.cu:13:
../../src/rambo.h:14:10: fatal error: 'CPPProcess.h' file not found
#include "CPPProcess.h"
         ^~~~~~~~~~~~~~
1 error generated when compiling for gfx90a.
make: *** [cudacpp.mk:561: gRamboSamplingKernels.o] Error 1

(Note: the readFromKFD error is always there and is harmless - I see it in every single LUMI build, all over the place)

valassi commented 6 months ago

Removing -G for this single file and building by hand is enough:

[valassia@nid005377 bash] ~/GPU2023/madgraph4gpuX/epochX/cudacpp/gq_ttq.mad/SubProcesses/P1_gu_ttxu > ccache /opt/rocm/bin/hipcc  -g -O0 -I. -I../../src -I/opt/rocm/include/ -target x86_64-linux-gnu --offload-arch=gfx90a -DHIP_FAST_MATH -DHIP_PLATFORM=amd -fPIC -std=c++17 -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -fPIC -c gRamboSamplingKernels.cu -o gRamboSamplingKernels.o

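Note: the problem is that CUOPTFLAGS really refers to CUDA but is also applied to HIP, where '-G' must be removed. (In the log above, clang reports '-G -I.' as a single unused argument, which suggests that '-G' swallows the following '-I.'; that would explain why 'CPPProcess.h' is not found.)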

debug: OPTFLAGS   = -g -O0
debug: CUOPTFLAGS = -G
debug: MAKEDEBUG := debug
debug: all.$(TAG)
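
One way to restrict '-G' to CUDA builds could look like the fragment below. This is only a sketch: GPUCC is a hypothetical variable assumed to hold the GPU compiler command (nvcc or hipcc), and this is not necessarily how the fix in PR #801 is implemented.

```makefile
# Sketch only: GPUCC is an assumed (hypothetical) variable naming the GPU compiler.
debug: OPTFLAGS   = -g -O0
ifneq ($(findstring nvcc,$(GPUCC)),)
# nvcc: '-G' generates debug information for device code
debug: CUOPTFLAGS = -G
else
# hipcc (clang): do not pass '-G', which is not a device-debug flag here
debug: CUOPTFLAGS =
endif
debug: MAKEDEBUG := debug
debug: all.$(TAG)
```

With this, the host debug options (-g -O0) stay the same for both backends and only the CUDA-specific flag becomes conditional.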

valassi commented 6 months ago

This is also fixed in PR #801 https://github.com/madgraph5/madgraph4gpu/pull/801/commits/6572c5d0afe9c26675e78876c4e68cab5206b49e

Remove '-G' from debug compiler options for HIP #808