madgraph5 / madgraph4gpu

GPU development for the Madgraph5_aMC@NLO event generator software package

Library path to AMD libraries may not be correct in certain cases #1020

Open Qubitol opened 4 weeks ago

Qubitol commented 4 weeks ago

Hi,

I've done some tests on a machine with an AMD GPU. However, running make fails because it cannot find libamdhip64.

The problem in my case

See this line in cudacpp.mk:

$(FC) -o $@ $(BUILDDIR)/fcheck_sa_fortran.o $(BUILDDIR)/fsampler_$(GPUSUFFIX).o $(LIBFLAGS) -lgfortran -L$(LIBDIR) -l$(MG5AMC_GPULIB) $(gpu_objects_exe) -lstdc++ -L$(shell cd -L $(shell dirname $(shell $(GPUCC) -print-prog-name=clang))/../..; pwd)/lib -lamdhip64

In my case, $(shell dirname $(shell $(GPUCC) -print-prog-name=clang)) yields /opt/rocm-6.2.2/lib/llvm/bin. The makefile then goes up two levels (to /opt/rocm-6.2.2/lib) and appends /lib, giving the final path /opt/rocm-6.2.2/lib/lib. That is not correct on my machine: the library directory is /opt/rocm-6.2.2/lib.
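The path arithmetic can be reproduced outside of make. This is a minimal sketch, assuming the clang directory reported above. The variable names are illustrative, and two dirname calls stand in for the makefile's "cd ../..; pwd" step, so nothing needs to exist on disk:

```shell
#!/bin/sh
# Directory that contains clang on the affected machine (taken from the report above).
clang_dir=/opt/rocm-6.2.2/lib/llvm/bin

# Go up two levels, as the makefile does with "/../..", then append /lib.
two_up=$(dirname "$(dirname "$clang_dir")")
derived_libdir="$two_up/lib"

echo "two levels up:  $two_up"
echo "derived libdir: $derived_libdir"
```

On this layout the two-levels-up directory is already the lib directory, so appending /lib produces the doubled lib/lib component.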

Possible solution to make it dynamic

I found the hipconfig command, which displays a range of information about the installation. One could then use:

hipconfig --rocmpath

to get exactly /opt/rocm-6.2.2, and then append /lib. This would be the dynamic approach, but it requires hipconfig to be available. Is this the case for every HIP-based GPU? If so, I can submit a PR fixing the makefiles.
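A rough sketch of what that lookup could look like in shell, before wiring it into cudacpp.mk. The function name rocm_libdir is hypothetical, and the sketch assumes that the libraries sit in the lib directory under the path printed by hipconfig --rocmpath, as on the machine above. When hipconfig is not on the PATH it only reports the problem, since what to fall back to is still an open question:

```shell
#!/bin/sh
# Hypothetical helper: print the ROCm library directory, or fail if hipconfig is missing.
rocm_libdir() {
  if command -v hipconfig >/dev/null 2>&1; then
    # hipconfig --rocmpath prints the ROCm root, e.g. /opt/rocm-6.2.2
    echo "$(hipconfig --rocmpath)/lib"
  else
    return 1
  fi
}

# Print the directory if it can be determined, otherwise a note on stderr.
rocm_libdir || echo "hipconfig not found; the library path cannot be derived this way" >&2
```

In the makefile, the output of such a lookup would replace the clang-based expression passed to -L before -lamdhip64.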

valassi commented 4 weeks ago

Hi @Qubitol, nice find. This is what I see on LUMI, where I do all my tests:

On LUMI login nodes

[valassia@uan02 bash] ~ > echo $(hipconfig --rocmpath)
/opt/rocm-6.0.3

On LUMI worker nodes

[valassia@nid005003 bash] ~ > echo $(hipconfig --rocmpath)
/opt/rocm-6.0.3

So, yes, this looks like a good solution. Thanks! Andrea