FLAMEGPU / FLAMEGPU2

FLAME GPU 2 is a GPU accelerated agent based modelling framework for CUDA C++ and Python
https://flamegpu.com
MIT License

MPI: CMake 3.18+ support #1114

Closed by Robadob 6 months ago

Robadob commented 9 months ago

The device link step failed when building the tests_mpi target of the distributed_ensemble branch on Bede with CMake 3.18, but succeeded with CMake 3.22.

We may need to consider raising the minimum required CMake version.

CMake 3.18

[ 93%] Linking CUDA device code CMakeFiles/tests_mpi.dir/cmake_device_link.o
cd /users/robadob/fgpu2/build/tests && /opt/software/builder/developers/tools/cmake/3.18.4/1/default/bin/cmake -E cmake_link_script CMakeFiles/tests_mpi.dir/dlink.txt --verbose=1
/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/bin/nvcc -forward-unknown-to-host-compiler -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] -Wno-deprecated-gpu-targets -Xcompiler=-Wl,-rpath,-Wl,/opt/software/builder/developers/libraries/openmpi/4.0.5/1/gcc-native-cuda-11.4.1/lib,-Wl,--enable-new-dtags,-pthread -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink CMakeFiles/tests_mpi.dir/test_cases/simulation/test_mpi_ensemble.cu.o CMakeFiles/tests_mpi.dir/helpers/host_reductions_common.cu.o CMakeFiles/tests_mpi.dir/helpers/device_initialisation.cu.o CMakeFiles/tests_mpi.dir/helpers/main.cu.o -o CMakeFiles/tests_mpi.dir/cmake_device_link.o   -L/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/targets/ppc64le-linux/lib/stubs  ../lib/Release/libflamegpu.a ../lib/libgtest.a  ../lib/Release/libtinyxml2.a -lstdc++fs -ldl -lcudadevrt -lcudart  -L"/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/lib64"
gcc: error: unrecognized command-line option ‘-Wl’; did you mean ‘-W’?
gcc: error: unrecognized command-line option ‘-rpath’
gcc: error: unrecognized command-line option ‘-Wl’; did you mean ‘-W’?
gcc: error: unrecognized command-line option ‘-Wl’; did you mean ‘-W’?
gmake[3]: *** [tests/CMakeFiles/tests_mpi.dir/build.make:155: tests/CMakeFiles/tests_mpi.dir/cmake_device_link.o] Error 1

CMake 3.22

[ 97%] Linking CUDA device code CMakeFiles/tests_mpi.dir/cmake_device_link.o
cd /users/robadob/fgpu2/build/tests && /users/robadob/miniconda/miniconda/envs/cmake/bin/cmake -E cmake_link_script CMakeFiles/tests_mpi.dir/dlink.txt --verbose=1
/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/bin/nvcc -forward-unknown-to-host-compiler -O3 -DNDEBUG --generate-code=arch=compute_70,code=[compute_70,sm_70] -Wno-deprecated-gpu-targets -Xcompiler=-fPIC -Wno-deprecated-gpu-targets -shared -dlink CMakeFiles/tests_mpi.dir/test_cases/simulation/test_mpi_ensemble.cu.o CMakeFiles/tests_mpi.dir/helpers/host_reductions_common.cu.o CMakeFiles/tests_mpi.dir/helpers/device_initialisation.cu.o CMakeFiles/tests_mpi.dir/helpers/main.cu.o -o CMakeFiles/tests_mpi.dir/cmake_device_link.o   -L/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/targets/ppc64le-linux/lib/stubs  ../lib/Release/libflamegpu.a ../lib/libgtest.a  ../lib/Release/libtinyxml2.a -ldl -lpthread -lcudadevrt -lcudart  -L"/opt/software/builder/developers/compilers/cuda/11.4.1/1/default/lib64"

Robadob commented 9 months ago

@ptheywood's research from Slack chat

Looks like it's an MPI specific thing

https://discourse.cmake.org/t/unable-to-link-cuda-device-code-with-mpich-implementation/3006/8

Which leads to an issue that was fixed by a merge request.

https://gitlab.kitware.com/cmake/cmake/-/issues/21887

https://gitlab.kitware.com/cmake/cmake/-/merge_requests/5966 is the merge request,

which appears to have been released as part of CMake 3.20.1.

ptheywood commented 9 months ago

Unable to reproduce this on x86_64 Ubuntu machines, which don't seem to require the flags being passed.

We probably just want to emit a dev warning for CMake < 3.20.1 saying that the build might error, as it's not a universal MPI + old CMake error.

ptheywood commented 9 months ago

I've just confirmed on Bede that CMake 3.20.0 fails to link with the error above, while 3.20.1 does work (when MPI is enabled and the MPI installation requires some extra env variables passing to the host linker, i.e. the OpenMPI install on Bede).

So adding a message(WARNING ...) when MPI is enabled and found, but CMake is < 3.20.1 as part of #1090 would be the way to address this (I'll quickly add one).
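
A rough sketch of what such a warning could look like is below. This is not the change that was made for #1090; the option name FLAMEGPU_ENABLE_MPI and the wording of the message are assumptions for illustration. It only uses standard CMake features (find_package(MPI), CMAKE_VERSION, VERSION_LESS, message(WARNING ...)).

```cmake
# Sketch only, not the project's actual implementation.
# FLAMEGPU_ENABLE_MPI is an assumed option name used for illustration.
if(FLAMEGPU_ENABLE_MPI)
    find_package(MPI COMPONENTS CXX)
    # Warn rather than fail: the device link error only appears with some
    # MPI installations (e.g. the OpenMPI install on Bede), not universally.
    if(MPI_FOUND AND CMAKE_VERSION VERSION_LESS 3.20.1)
        message(WARNING
            "MPI was found, but CMake ${CMAKE_VERSION} is older than 3.20.1. "
            "CUDA device linking may fail with some MPI installations, see "
            "https://gitlab.kitware.com/cmake/cmake/-/issues/21887. "
            "Consider using CMake >= 3.20.1.")
    endif()
endif()
```

Using message(WARNING ...) keeps configuration going on machines such as the x86_64 Ubuntu systems above where the older CMake still links correctly. message(AUTHOR_WARNING ...) would be the alternative if this should only surface as a dev warning.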