GOMC-WSU / GOMC

GOMC - GPU Optimized Monte Carlo is a parallel molecular simulation code designed for high-performance simulation of large systems
https://gomc-wsu.org
MIT License

GOMC GPU binaries do not build properly with CUDA 12.2 #522

Open bc118 opened 10 months ago

bc118 commented 10 months ago

Describe the bug: GOMC does not build properly for GPU with CUDA 12.2. It looks like "compute_35" needs to be removed from the base GOMC code for the current and development versions.

Error produced:

nvcc fatal : Unsupported gpu architecture 'compute_35'
nvcc fatal : Unsupported gpu architecture 'compute_35'
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:76: CMakeFiles/GPU_NVT.dir/src/GPU/CalculateEnergyCUDAKernel.cu.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:104: CMakeFiles/GPU_NVT.dir/src/GPU/CalculateEwaldCUDAKernel.cu.o] Error 1
nvcc fatal : Unsupported gpu architecture 'compute_35'
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:132: CMakeFiles/GPU_NVT.dir/src/GPU/CUDAMemoryManager.cu.o] Error 1
nvcc fatal : Unsupported gpu architecture 'compute_35'
nvcc fatal : Unsupported gpu architecture 'compute_35'
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:90: CMakeFiles/GPU_NVT.dir/src/GPU/CalculateForceCUDAKernel.cu.o] Error 1
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:118: CMakeFiles/GPU_NVT.dir/src/GPU/ConstantDefinitionsCUDAKernel.cu.o] Error 1
nvcc fatal : Unsupported gpu architecture 'compute_35'
make[3]: *** [CMakeFiles/GPU_NVT.dir/build.make:146: CMakeFiles/GPU_NVT.dir/src/GPU/TransformParticlesCUDAKernel.cu.o] Error 1
make[2]: *** [CMakeFiles/Makefile2:175: CMakeFiles/GPU_NVT.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:182: CMakeFiles/GPU_NVT.dir/rule] Error 2


GregorySchwing commented 10 months ago

This is due to us being stuck between a rock and a hard place: either we require a newish CMake version that detects the architecture, or we allow old versions and try to guess all the architectures a user might need. I can drop 35, but this is the reasoning.

LSchwiebert commented 9 months ago

Greg is correct. CUDA 11 deprecated compute capability 3.5, which covers the older Kepler cards. CUDA 12 removed support for it.
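To illustrate the toolkit-version point, here is a minimal sketch of pruning 35 from a hand-maintained architecture list when the compiler is CUDA 12 or newer. It is not taken from the GOMC CMake files: the project name, the `SKETCH_CUDA_ARCHS` variable, and the architecture values are all hypothetical, and it assumes CMake 3.18 or newer with first-class CUDA language support.

```cmake
cmake_minimum_required(VERSION 3.18)  # CMAKE_CUDA_ARCHITECTURES is available from 3.18
project(PruneArchSketch LANGUAGES CXX CUDA)

# Hypothetical hand-maintained list of compute capabilities (assumption,
# not GOMC's actual list).
set(SKETCH_CUDA_ARCHS 35 60 70 75)

# nvcc from CUDA 12 rejects compute_35, so drop it for those toolkits.
# CMAKE_CUDA_COMPILER_VERSION is populated once the CUDA language is enabled.
if(CMAKE_CUDA_COMPILER_VERSION VERSION_GREATER_EQUAL 12.0)
  list(REMOVE_ITEM SKETCH_CUDA_ARCHS 35)
endif()

# Targets created after this point pick up the pruned list.
set(CMAKE_CUDA_ARCHITECTURES ${SKETCH_CUDA_ARCHS})
```

With a check like this, a CUDA 11.8 build would still target 3.5 while a CUDA 12.2 build would skip it.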

I'm planning a new CMake setup that will require users to have a newer version of CMake for some other reasons, but the version that can build for the appropriate architectures automatically is something like CMake 3.24, which is pretty new. Removing 35 from the list is the simplest workaround, but not a long-term solution. The new CMake setup will test the CMake version and build accordingly, so only those running an older version of CMake will have to work around this problem.
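A rough sketch of what such a version test could look like, under stated assumptions: it is not the code in the build-issues branch, the project name and the fallback architecture list are hypothetical, and it relies on the `native` value of `CMAKE_CUDA_ARCHITECTURES`, which CMake 3.24 introduced.

```cmake
cmake_minimum_required(VERSION 3.18)

# Respect a value passed on the command line, e.g. -DCMAKE_CUDA_ARCHITECTURES=70.
if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
  if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.24)
    # CMake 3.24+ can target the GPU(s) present on the build host.
    set(CMAKE_CUDA_ARCHITECTURES native)
  else()
    # Older CMake: fall back to a guessed list (hypothetical values, with 35
    # left out so that CUDA 12 toolkits accept every entry).
    set(CMAKE_CUDA_ARCHITECTURES 60 70 75)
  endif()
endif()

# Set the variable before the CUDA language is enabled so it is honored.
project(ArchSelectSketch LANGUAGES CXX CUDA)
```

Users on an older CMake can still override the guessed list from the command line, which is the workaround mentioned above in a less manual form.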

If you are curious, you can look at the CMake files in the build-issues branch, which I hope to have ready for review soon...

bc118 commented 9 months ago

Yeah, let me know when you get it ready. It's not a huge issue at the moment, as I can just wait or run CUDA 11.8. However, I thought it should be noted here, as it is likely to matter to other users soon.