Closed drbenmorgan closed 4 months ago
Can one of the admins verify this patch?
For simplicity of future update to CudaRdcUtils.cmake
, the git commit log should indicate which commit from either the Celeritas repository or the CudaRdc repository the file was copied from.
@drbenmorgan , both me an @JuanGonzalezCaminero are getting locally linking errors with this branch. I have the VecGeom master and cmake 3.28.3
[ 47%] Linking CXX executable ../BuildProducts/bin/test_atomic
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::ReturnTracks(std::vector<adeptint::TrackData, std::allocator<adeptint::TrackData> >*, int)'
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::ProcessGPUHits(HostScoring&, HostScoring::Stats&)'
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::InitScoringData(adeptint::VolAuxData*)'
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::GetUniformFieldZ()'
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::InitVolAuxData(adeptint::VolAuxData*, G4HepEmState*, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)'
/usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::CheckGeometry(G4HepEmState*)'
However, I checked and the symbols are defined in libAdePT_G4_integration.so
0000000000071410 T AdePTGeant4Integration::ReturnTracks(std::vector<adeptint::TrackData, std::allocator<adeptint::TrackData> >*, int)
The error is the same if I change manually the link order of libAdePT.so
and libAdePT_G4_integration.so
Any idea?
also getting when linking example1:
make[1]: *** [CMakeFiles/Makefile2:1376: test/CMakeFiles/test_atomic.dir/all] Error 2
/usr/bin/ld: /home/agheata/geant_src/VecGeom_install/lib/libvecgeom.a(UnplacedCone.cpp.o): undefined reference to symbol '_ZNK7vecgeom3cxx9DevicePtrINS_4cuda13SUnplacedConeINS2_9ConeTypes13UniversalConeEEEE9ConstructIJdddddddEEEvDpT_'
/usr/bin/ld: /home/agheata/geant_src/VecGeom_install/lib/libvecgeomcuda.so: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [examples/Example1/CMakeFiles/example1.dir/build.make:392: BuildProducts/bin/example1] Error 1
@agheata, I've been chatting to @JuanGonzalezCaminero on Mattermost about this - I'm in the process of rebuilding my local stack with the newer CUDA and CMake that he has to see if I can reproduce. What I am now spotting is there is a undeclared (to CMake) circular dependence between libAdePT
and libAdePT_G4_integration
....
@pcanal may have insight on the
/usr/bin/ld: /home/agheata/geant_src/VecGeom_install/lib/libvecgeomcuda.so: error adding symbols: DSO missing from command line
error. I believe was seen as well in Celeritas and patched?
One other thing to check: what is the full list of Vecgeom libraries in your installs (e.g. under /home/agheata/geant_src/VecGeom_install/lib)? I've tried locally with both static and shared builds and am simply not able to reproduce. Could you also let me know the versions of VecGeom and Geant4 (down to commit) you are building against, and the exact set of build options used to configure and install them please?
There's some weird difference or specific configuration setup that's causing these problems for you but not me or the CI system.
@pcanal may have insight on the
/usr/bin/ld: /home/agheata/geant_src/VecGeom_install/lib/libvecgeomcuda.so: error adding symbols: DSO missing from command line
error. I believe was seen as well in Celeritas and patched?
Yes we have seem similar but slightly different. The fix was celeritas-project/celeritas@03b449a
(#1102). A description of the problem can be found at: https://zhangboyi.gitlab.io/post/2020-09-14-resolve-dso-missing-from-command-line-error/
The bottom-line is that you might need to add explicitly more library to the link library
list or .... something is messed up in the installation (eg. installation of both shared and static version of VecGeom in the same area or .... )
there is a undeclared (to CMake) circular dependence between
libAdePT
andlibAdePT_G4_integration
....
Indeed, we are looking now into how to break the dependency of lib AdePT on libAdePT_G4_integration
One other thing to check: what is the full list of Vecgeom libraries in your installs (e.g. under /home/agheata/geant_src/VecGeom_install/lib)?
libvecgeom.a libvecgeomcuda.so libvecgeomcuda_static.a libvgdml.a
Could you also let me know the versions of VecGeom and Geant4 (down to commit) you are building against, and the exact set of build options used to configure and install them please?
* 89ec50d4 (HEAD -> master, origin/master, origin/HEAD) VECGEOM-618 : Added new definition of SafetyToOut to fix SafetyToOut issue as mentioned in JIRA issue-618 [Raman Sehgal]
cmake $GEANT_ROOT/VecGeom \
-DCMAKE_INSTALL_PREFIX="$INSTALL_FOLDER" \
-DCMAKE_PREFIX_PATH="$VECCORE_ROOT" \
-DVECGEOM_GDML=ON \
-DVECGEOM_ROOT=OFF \
-DVECGEOM_GEANT4=OFF \
-DVECGEOM_ENABLE_CUDA=ON \
-DVECGEOM_BACKEND=Scalar \
-DVECGEOM_NO_SPECIALIZATION=ON \
-DVECGEOM_VECTOR=avx2 \
-DVECGEOM_TEST_BENCHMARK=ON \
-DVECGEOM_TEST_COVERAGE=OFF \
-DCMAKE_BUILD_TYPE=$BUILDMODE \
-DCMAKE_CUDA_ARCHITECTURES=75 \
-DUSE_CACHED_TRANSFORMATIONS=ON \
-DCMAKE_CXX_STANDARD=17 \
-DCMAKE_CUDA_STANDARD=17 \
-DEXAMPLES=ON
and for AdePT:
Geant4 v11.2.0
cmake -S. -B./adept-build/${VGMODE} \
-DCMAKE_INSTALL_PREFIX="$INSTALL_FOLDER" \
-DCMAKE_BUILD_TYPE=$BUILDMODE \
-DCMAKE_PREFIX_PATH="$VECCORE_ROOT;$VECGEOMROOT/lib/cmake/VecGeom;$HEPMC3_ROOT;$G4INSTALL;$G4HepEm_INSTALL" \
-DCMAKE_CUDA_ARCHITECTURES=75 \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON \
-DADEPT_USE_SURF=OFF \
-DBENCHMARK=ON \
-DBUILD_TESTING=ON
Thanks for the info @agheata, I think the crucial thing is that we have the same VecGeom library structure here, i.e. only a static libvecgeom
. Let's take one step back to the error you reported with the build of test_atomic
:
> [ 47%] Linking CXX executable ../BuildProducts/bin/test_atomic
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::ReturnTracks(std::vector<adeptint::TrackData, std::allocator<adeptint::TrackData> >*, int)'
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::ProcessGPUHits(HostScoring&, HostScoring::Stats&)'
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::InitScoringData(adeptint::VolAuxData*)'
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::GetUniformFieldZ()'
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::InitVolAuxData(adeptint::VolAuxData*, G4HepEmState*, bool, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*)'
> /usr/bin/ld: ../BuildProducts/lib/libAdePT.so: undefined reference to `AdePTGeant4Integration::CheckGeometry(G4HepEmState*)'
Could you run:
$ make clean && make VERBOSE=1 test_atomic
and post the compile/link commands for the source file and end executable (should be the last two) please? For comparison, with my local, working build, I get (paths reduced for clarity):
[13/14] /.../nvcc -forward-unknown-to-host-compiler -O2 -g -DNDEBUG "--generate-code=arch=compute_86,code=[compute_86,sm_86]" /.../BuildProducts/lib64/libAdePT_static.a /.../lib64/libvecgeomcuda_static.a -Xnvlink --suppress-stack-size-warning -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink test/CMakeFiles/test_atomic.dir/test_atomic.cu.o -o CMakeFiles/test_atomic.dir/cmake_device_link.o -L/.../lib/stubs -L/...cuda.../lib /.../libvgdml.a /.../libvecgeom.a -lrt -lpthread -ldl -lcudadevrt -lcudart
[14/14] : && /usr/bin/c++ -O2 -g -DNDEBUG test/CMakeFiles/test_atomic.dir/test_atomic.cu.o CMakeFiles/test_atomic.dir/cmake_device_link.o -o BuildProducts/bin/test_atomic -L/...cuda.../stubs -L/...cuda.../lib -Wl,-rpath,/.../ BuildProducts/lib64/libAdePT_G4_integration.so BuildProducts/lib64/libAdePT.so /.../libvecgeomcuda.so /.../libvgdml.a /.../libvecgeom.a -lrt -lpthread -ldl /...geant4, g4hepemlibs.../ -lcudadevrt -lcudart
This is what I get for test_atomic @drbenmorgan
[100%] Linking CUDA device code CMakeFiles/test_atomic.dir/cmake_device_link.o
cd /home/gjuan/Dev/G4/AdePT_Master/build/test && /usr/bin/cmake -E cmake_link_script CMakeFiles/test_atomic.dir/dlink.txt --verbose=1
/usr/local/cuda-12.3/bin/nvcc -forward-unknown-to-host-compiler -O3 -DNDEBUG "--generate-code=arch=compute_89,code=[compute_89,sm_89]" /.../libAdePT_static.a /.../libvecgeomcuda_static.a -Xnvlink --suppress-stack-size-warning -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink --options-file CMakeFiles/test_atomic.dir/deviceObjects1.rsp -o CMakeFiles/test_atomic.dir/cmake_device_link.o --options-file CMakeFiles/test_atomic.dir/deviceLinkLibs.rsp
[100%] Linking CXX executable ../BuildProducts/bin/test_atomic
cd /home/gjuan/Dev/G4/AdePT_Master/build/test && /usr/bin/cmake -E cmake_link_script CMakeFiles/test_atomic.dir/link.txt --verbose=1
/usr/bin/c++ -O3 -DNDEBUG CMakeFiles/test_atomic.dir/test_atomic.cu.o CMakeFiles/test_atomic.dir/cmake_device_link.o -o ../BuildProducts/bin/test_atomic -L/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs -L/usr/local/cuda-12.3/targets/x86_64-linux/lib -L/usr/lib/gcc/x86_64-linux-gnu/12 -Wl,-rpath,/.../ ../BuildProducts/lib/libAdePT_G4_integration.so ../BuildProducts/lib/libAdePT.so /.../libvecgeomcuda.so /.../libvgdml.a /.../libvecgeom.a -lrt -lpthread -ldl /.../libG4Tree.so /.../libG4FR.so /.../libG4GMocren.so /.../libG4visHepRep.so /.../libG4RayTracer.so /.../libG4VRML.so /.../libG4ToolsSG.so /.../libG4vis_management.so /.../libG4modeling.so /.../libG4interfaces.so /.../libG4persistency.so /usr/lib/x86_64-linux-gnu/libxerces-c.so /.../libG4error_propagation.so /.../libG4readout.so /.../libG4physicslists.so /.../libG4run.so /.../libG4parmodels.so /.../libg4HepEm.so /.../libG4event.so /.../libG4tracking.so /.../libg4HepEmInit.so /.../libG4processes.so /.../libG4analysis.so /.../libG4digits_hits.so /.../libG4track.so /.../libG4geometry.so /.../libG4graphics_reps.so /.../libG4expat.so /.../libG4particles.so /.../libG4materials.so /.../libG4intercoms.so /.../libG4global.so /.../libG4ptl.so.2.3.3 /.../libG4zlib.so /.../libg4HepEmRun.so /.../libG4clhep.so /.../libg4HepEmData.so -lcudadevrt -lcudart
Thanks @JuanGonzalezCaminero, is that failing with the same error as @agheata reported earlier? The only difference I can spot is the use of response files for the device link step. Could you check/post the contents of the files:
test/CMakeFiles/test_atomic.dir/deviceObjects1.rsp
test/CMakeFiles/test_atomic.dir/deviceLinkLibs.rsp
please?
edit: clarified the path to these files from the build directory.
Yes, we were having the same errors, the files have:
deviceObjects1.rsp:
CMakeFiles/test_atomic.dir/test_atomic.cu.o
deviceLinkLibs.rsp
-L"/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs" -L"/usr/local/cuda-12.3/targets/x86_64-linux/lib" -L"/usr/lib/gcc/x86_64-linux-gnu/12" /home/gjuan/Dev/G4/vecgeom_installation/lib/libvgdml.a /home/gjuan/Dev/G4/vecgeom_installation/lib/libvecgeom.a -lrt -lpthread -ldl -lcudadevrt -lcudart
Indeed I'm able to build after the last commit @drbenmorgan
O.k., I have absolutely no idea why things are failing on these specific machines, so the latest commit is a shot in the dark. All it does is add a dummy .cu
file to AdePT_G4_integration
. This forces it to be treated as an RDC library and should then appear on the device link line for test_atomic
. This might resolve the missing symbol errors on the CERN machines, but if not then....
Thanks for the update @JuanGonzalezCaminero, that's something of a relief! If you could also check that your separate "findpackage_test" example building against an install also works, that'd cover that base as well.
All I can really imagine is that this is something to do with GCC - could you and @agheata double check and confirm which Linux, gcc and associated toolchain is in use please (i.e. gcc, gcc-ar gcc-ranlib etc)?
Ah, I wasn't building the examples, I confirm the commit solved the linking issue with AdePT_G4_Integration, however the vecgeom DSO missing from command line
error still appears, so I can't test the external linking yet.
Right, same exercise as before, so could you post the full link line for example1
please? It'll be the last one if you run
$ make clean && make VERBOSE=1 example1
This is what I get:
cd /home/gjuan/Dev/G4/AdePT_Master/build/examples/Example1 && /usr/bin/cmake -E cmake_link_script CMakeFiles/example1.dir/link.txt --verbose=1
/usr/bin/c++ -O3 -DNDEBUG CMakeFiles/example1.dir/example1.cpp.o CMakeFiles/example1.dir/__/common/src/ParticleGun.cc.o CMakeFiles/example1.dir/__/common/src/ParticleGunMessenger.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_HepEm.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_AdePT.cc.o CMakeFiles/example1.dir/src/ActionInitialisation.cc.o CMakeFiles/example1.dir/src/DetectorConstruction.cc.o CMakeFiles/example1.dir/src/DetectorMessenger.cc.o CMakeFiles/example1.dir/src/EventAction.cc.o CMakeFiles/example1.dir/src/EventActionMessenger.cc.o CMakeFiles/example1.dir/src/SteppingAction.cc.o CMakeFiles/example1.dir/src/SimpleHit.cc.o CMakeFiles/example1.dir/src/PrimaryGeneratorAction.cc.o CMakeFiles/example1.dir/src/RunAction.cc.o CMakeFiles/example1.dir/src/SensitiveDetector.cc.o CMakeFiles/example1.dir/src/TrackingAction.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReader.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReaderMessenger.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4Interface.cc.o -o ../../BuildProducts/bin/example1 -Wl,-rpath,/.../ /.../HepMC3libs ../../BuildProducts/lib/libAdePT_G4_integration_final.so ../../BuildProducts/lib/libAdePT_G4_integration.so ../../BuildProducts/lib/libAdePT.so /.../libvecgeomcuda.so /.../libvgdml.a /.../libvecgeom.a -lrt -lpthread -ldl /.../Geant4libs /.../G4HepEMlibs
Which looks identical to mine... One last thing to try - could you modify the link command in examples/Example1/CMakeLists.txt
to:
cuda_rdc_target_link_libraries(example1
PRIVATE
AdePT_G4_integration
AdePT
VecGeom::vecgeom
${HEPMC3_LIBRARIES}
${HEPMC3_FIO_LIBRARIES}
)
Note the extra VecGeom link. This should fully declare that as needed in the link rather than relying on transitive inclusion. If that still fails, could you also:
As noted in #280, we should wait on the fix to the circular dependency presented there to go through with the existing link structure. I can then rebase here so we can at least decouple these issues (if they are not already orthogonal).
As I commented in #280:
FWIW, I can build the changes here without issue on my local system (Alma 9) with either GCC 11.4 or 12.2 and with CUDA > 12.0 or 12.3, plus a standard VecGeom CUDA enabled install.
and also the CI machine is passing. @agheata, @JuanGonzalezCaminero, which machines are you using, and are they accessible for CERN account holders at all? Are you able to reproduce the errors on lxplus-gpu
? We can discuss further on Mattermost as required, but there's something very specific to your setups that's exposing this problem.
One additional cross-check to try (if @pcanal and @sethrj don't mind):
develop
branch on the systems in question using the Geant4/VecGeom/etc setup as for AdePTThis should expose any fundamental system problems.
@drbenmorgan Building the current Celeritas develop
branch with the following configuration:
CXX=/usr/bin/gcc-10 cmake -S. -B./build
-DCMAKE_INSTALL_PREFIX="../celeritas_installation"
-DCMAKE_PREFIX_PATH="../geant4_installation;../vecgeom_installation"
-DCMAKE_CUDA_ARCHITECTURES=89
I get the following when it tries to link orange-update
[ 65%] Linking CXX executable ../bin/orange-update
/usr/bin/ld: CMakeFiles/orange-update.dir/orange-update.cc.o: undefined reference to symbol '_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE7compareEPKc@@GLIBCXX_3.4.21'
/usr/bin/ld: /lib/x86_64-linux-gnu/libstdc++.so.6: error adding symbols: DSO missing from command line
But in this case it seems unrelated. It seems I can't compile with gcc 12.3, the compiler I used instead is gcc 10.5
@JuanGonzalezCaminero that last error message looks like you're mixing incompatible compilers...
It may well be, yes. What combination of gcc - nvcc are you normally using? @sethrj
On our development machine we use GCC 8 and CUDA 11.5. But the problem you seem to be having is that you've likely compiled VecGeom/Geant4 with one version of GCC and are building Celeritas with a different one.
- Could @pcanal and @sethrj try building this PR in their Celeritas dev environment(s)
This should expose any fundamental system problems.
CMake Error at CMakeLists.txt:76 (message):
AdePT requires at least VecGeom 2.0.0-dev.3, found 1.2.4
@drbenmorgan I don't even see a 2.0.0-dev.3 tag for vecgeom...
CMake Warning at CMakeLists.txt:133 (find_package):
By not providing "FindG4HepEm.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "G4HepEm", but
CMake did not find one.
Could not find a package configuration file provided by "G4HepEm" with any
of the following names:
G4HepEmConfig.cmake
g4hepem-config.cmake
Add the installation prefix of "G4HepEm" to CMAKE_PREFIX_PATH or set
"G4HepEm_DIR" to a directory containing one of the above files. If
"G4HepEm" provides a separate development package or SDK, be sure it has
been installed.
CMake Error at cmake/CudaRdcUtils.cmake:617 (set_target_properties):
Target "AdePT_objects" links to:
G4HepEm::g4HepEm
but the target was not found. Possible reasons include:
* There is a typo in the target name.
* A find_package call is missing for an IMPORTED target.
* An ALIAS target is missing.
Call Stack (most recent call first):
cmake/CudaRdcUtils.cmake:746 (cuda_rdc_use_middle_lib_in_property)
CMakeLists.txt:180 (cuda_rdc_target_link_libraries)
I guess it requires g4hepem as an external install now rather than a builtin?
You should add a check to the main CMakeLists that G4HepEm is built with GPU enabled; otherwise there are a bunch of build errors.
@sethrj, ah *&^% I'd forgotten about these so don't worry about doing the build for now. I have a patch to spack for the vecgeom commit and file for g4hepem if you wish but it's not critical. Let me just triple-check that I can build celeritas develop locally.
@drbenmorgan On my third try I got it working π master commit of g4hepem and vecgeom, on your branch, built with -DCMAKE_CUDA_ARCHITECTURES=70 -DCMAKE_BUILD_TYPE=RelWithDebInfo
and no errors and everything seems to run fine.
Thanks @sethrj, much appreciated π! That was with GCC8, CUDA 11.5 (which OS?)?
linux-rhel8-cascadelake, gcc 8.5.0, CUDA 11.5.
O.k., on my setup (linux-rocky9-cascadelake, gcc 11.4.1, CUDA 12) the same spack env builds both this PR and celeritas develop perfectly (so even Celeritas with Vecgeom 2.0.0-rc2, which is this tag/tarball
@drbenmorgan We just merged the removal of circular dependencies. Could this PR be rebased onto master?
Latest commits are a very quick and dirty rebase on the latest master
, with fixups for the change to header only etc. It works locally for me modulo some device code warnings on flags and that the testField
test doesn't pass. However, I want to see what the CI shows first.
O.k., CI looks good, so @JuanGonzalezCaminero, @agheata could you try building again and see if things are now resolved on your side please?
Still getting one error here:
[100%] Linking CXX executable ../../BuildProducts/bin/example1
/usr/bin/ld: ../../BuildProducts/lib/libAdePT_G4_integration.so: undefined reference to `__cudaRegisterLinkedBinary_5ad212d0_23_AdePTTrackingManager_cu_6f198e90_884112'
Simplified link line:
/usr/bin/c++ -O3 -DNDEBUG CMakeFiles/example1.dir/example1.cpp.o CMakeFiles/example1.dir/__/common/src/ParticleGun.cc.o CMakeFiles/example1.dir/__/common/src/ParticleGunMessenger.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_HepEm.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_AdePT.cc.o CMakeFiles/example1.dir/src/ActionInitialisation.cc.o CMakeFiles/example1.dir/src/DetectorConstruction.cc.o CMakeFiles/example1.dir/src/DetectorMessenger.cc.o CMakeFiles/example1.dir/src/EventAction.cc.o CMakeFiles/example1.dir/src/EventActionMessenger.cc.o CMakeFiles/example1.dir/src/SteppingAction.cc.o CMakeFiles/example1.dir/src/SimpleHit.cc.o CMakeFiles/example1.dir/src/PrimaryGeneratorAction.cc.o CMakeFiles/example1.dir/src/RunAction.cc.o CMakeFiles/example1.dir/src/SensitiveDetector.cc.o CMakeFiles/example1.dir/src/TrackingAction.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReader.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReaderMessenger.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4Interface.cc.o -o ../../BuildProducts/bin/example1 -Wl,-rpath,/.../ HepMC3libs ../../BuildProducts/lib/libAdePT_G4_integration_final.so ../../BuildProducts/lib/libAdePT_G4_integration.so VecGeomlibs -lrt -lpthread -ldl XercesC G4libs G4HepEMlibs
And the same line on Master:
/usr/bin/c++ -O3 -DNDEBUG CMakeFiles/example1.dir/example1.cpp.o CMakeFiles/example1.dir/__/common/src/ParticleGun.cc.o CMakeFiles/example1.dir/__/common/src/ParticleGunMessenger.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_HepEm.cc.o CMakeFiles/example1.dir/__/common/src/FTFP_BERT_AdePT.cc.o CMakeFiles/example1.dir/src/ActionInitialisation.cc.o CMakeFiles/example1.dir/src/DetectorConstruction.cc.o CMakeFiles/example1.dir/src/DetectorMessenger.cc.o CMakeFiles/example1.dir/src/EventAction.cc.o CMakeFiles/example1.dir/src/EventActionMessenger.cc.o CMakeFiles/example1.dir/src/SteppingAction.cc.o CMakeFiles/example1.dir/src/SimpleHit.cc.o CMakeFiles/example1.dir/src/PrimaryGeneratorAction.cc.o CMakeFiles/example1.dir/src/RunAction.cc.o CMakeFiles/example1.dir/src/SensitiveDetector.cc.o CMakeFiles/example1.dir/src/TrackingAction.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReader.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4AsciiReaderMessenger.cc.o CMakeFiles/example1.dir/__/common/src/HepMC3G4Interface.cc.o -o ../../BuildProducts/bin/example1 -L/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs -L/usr/local/cuda-12.3/targets/x86_64-linux/lib -L/usr/lib/gcc/x86_64-linux-gnu/12 -Wl,-rpath,/.../ ../../BuildProducts/lib/libAdePT_G4_integration.so HepMC3libs VecGeomlibs -lrt -lpthread -ldl XercesC G4libs G4HepEMlibs -lcudadevrt -lcudart_static -lrt -lpthread -ldl
Argh... Could you try adding this line:
target_link_options(example1 PRIVATE "-Wl,--no-as-needed")
at line 64 of examples/example1/CMakeLists.txt please? Are you running on an Ubuntu machine by any chance?
That line solves it indeed. I'm running on Debian
Yep, I found out quite some time ago that VecGeom requires this on Ubuntu systems: see https://wiki.gentoo.org/wiki/Project:Quality_Assurance/-Wl,-z,defs_and_-Wl,--no-allow-shlib-undefined .
Thanks both! @JuanGonzalezCaminero, could you try modifying the line to:
target_link_options(example1 PRIVATE "-Wl,-z,defs")
and see what it prints please? Not a fix, but useful to see what it produces. I'll have a think about how to fix this for installs, but in the meantime I think we can use a project wide no-as-needed
for our own builds.
As far as I can tell I get the same error and link line when using those options. Is there something specific I should look for?
If I understand the option correctly, it should report specifics on what's missing, though perhaps it needs to be applied to the libraries itself? Anyway, I'm now able to reproduce the error locally by adding the --as-needed
flag to my builds so can at leats check fixes locally (which I'll do).
@drbenmorgan it also works for me when adding target_link_options(integrationBenchmark PRIVATE "-Wl,--no-as-needed")
, do you plan to add it globally for now in this PR? Or should we use -Wl,-z,defs
during the creation of the libAdePT_G4_integration libs to force the resolution of all dependencies before the examples are built?
IIUC, -Wl,-z,defs
is only for library developers to warn, rather than fix things (but I need to explore this more). The idea for now would be to use --no-as-needed
globally in AdePT, and then document that this may be needed by clients to resolve things.
As @WitekPokorski reported that the current AdePT libraries had been successfully linked in Gaussino, I'm first going to go back to the celer-adept joint integration layer and see what happens there when fully linking in and using AdePT before deciding how to progress this PR (I also want to go back and update g4vg
and its use as I think that's a higher priority?).
Closing as superseded by #289
Due to VecGeom's legacy CUDA design and the exposure of device functions in public APIs of libraries, ensuring correct singular device code links in final executables is, and continues to be, awkward. This largely means reproducing VecGeom's pattern of shared/static/device linked libraries downstream so that usage can be appropriately tracked and the correct linking done at each stage. A suitable CMake module that packages all of the needed functionality is available from the Celeritas project where it's also been tested across a range of use cases and is therefore now pretty robust (as it can be).
This PR imports this CMake module into AdePT (it's a completely standalone file) and ports the build, link and use of AdePT to use it. The primary changes are:
FindVecGeom
wrapper module to add RDC properties to VecGeom targets (to possibly be upstreamed in VecGeom)target_XXX
calls withcuda_rdc_XXX
equivalentsAdePT_cuda
andAdePT_standalone
libraries thatCudaRdcUtils
build and handles automatically locally and in downstream use.FindVecGeom
andCudaRdcUtils
for use by downstream clientsI've tested this locally on a Rocky 9/GCC 11/Cuda 12 system with an underlying VecGeom install with static libs and CUDA enabled. The build and tests all run fine and without problem. I've also been able to take one of the examples, recast it into a standalone project and build and run it against an install of AdePT. However, CI and wider use may show up additional bugs...
Pinging @pcanal and @sethrj for info and any additional comments they want to make.