gpufit / Gpufit

GPU-accelerated Levenberg-Marquardt curve fitting in CUDA
MIT License

Kepler architecture is deprecated from CUDA 11 and nvcc throws an error #87

Closed: lmmx closed this issue 3 years ago

lmmx commented 3 years ago

When trying to build Gpufit on Linux with CUDA 11, I received an error that compute_30 was an "unsupported architecture".

nvcc fatal   : Unsupported gpu architecture 'compute_30'
CMake Error at Gpufit_generated_cuda_gaussjordan.cu.o.RELEASE.cmake:222 (message):
  Error generating
  /home/louis/dev/gpufit_dev/gpufit-build/Gpufit/CMakeFiles/Gpufit.dir//./Gpufit_generated_cuda_gaussjordan.cu.o

make[2]: *** [Gpufit/CMakeFiles/Gpufit.dir/build.make:93: Gpufit/CMakeFiles/Gpufit.dir/Gpufit_generated_cuda_gaussjordan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1114: Gpufit/CMakeFiles/Gpufit.dir/all] Error 2
make: *** [Makefile:95: all] Error 2

This happens because I have CUDA 11, but the first two architectures in the CUDA_ARCHITECTURES list are 3.0 and 3.5:

-- CUDA_ARCHITECTURES=3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX
-- CUDA_NVCC_FLAGS=-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,...
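
For context, each entry in CUDA_ARCHITECTURES ends up as a pair of nvcc -gencode arguments, which is why a 3.0 entry produces compute_30. The following is a minimal sketch of that mapping in plain CMake, not the FindCUDA implementation; the variable names archs, arch_num, and flags are made up for illustration.

```cmake
cmake_minimum_required( VERSION 3.10 )

# Hypothetical illustration: turn architecture versions into -gencode flags.
set( archs "3.0" "3.5" )
set( flags "" )
foreach( arch IN LISTS archs )
  # Drop the dot so that "3.0" becomes "30".
  string( REPLACE "." "" arch_num "${arch}" )
  list( APPEND flags "-gencode" "arch=compute_${arch_num},code=sm_${arch_num}" )
endforeach()
message( STATUS "flags=${flags}" )
```

Saved as a standalone script, this can be run with cmake -P; the resulting list has the same shape as the CUDA_NVCC_FLAGS line above.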

The problem is that compute_30 is deprecated as of CUDA 11, as documented here:

Fermi†   Kepler†   Maxwell‡   Pascal   Volta   Turing   Ampere   Lovelace*   Hopper**
sm_20    sm_30     sm_50      sm_60    sm_70   sm_75    sm_80    sm_90?      sm_100c?
         sm_35     sm_52      sm_61    sm_72            sm_86
         sm_37     sm_53      sm_62

† Fermi and Kepler are deprecated from CUDA 9 and 11 onwards

‡ Maxwell is deprecated from CUDA 12 onwards

* Lovelace is the microarchitecture replacing Ampere (AD102)

** Hopper is NVIDIA’s rumored “tesla-next” series, with a 5nm process.

A comment on the post also mentioned that "Support for Kepler sm_30 and sm_32 architecture based products is dropped.", i.e. 3.2 belongs alongside 3.0, 3.5, and 3.7 in the table from the blog post (reproduced above).

This explains why compute_30 threw an error and suggests a fix: test whether the CUDA version is greater than or equal to 11 and, if so, skip architectures 3.7 and below.

To resolve this I changed Gpufit/Gpufit/CMakeLists.txt to start from an empty list instead. While I was fixing it for my current architecture, I thought I should also future-proof it for CUDA 12 and post it here:

elseif( CUDA_ARCHITECTURES STREQUAL All )
  # All does not include the latest PTX!
  set( CUDA_ARCHITECTURES "" )
  if( CUDA_VERSION VERSION_LESS "11.0" )
    list( INSERT CUDA_ARCHITECTURES "3.0" "3.5" )
  endif()
  if( CUDA_VERSION VERSION_LESS "12.0" )
    list( INSERT CUDA_ARCHITECTURES "5.0" "5.2" )
  endif()
  if( CUDA_VERSION VERSION_LESS "9.0" )
    list( INSERT CUDA_ARCHITECTURES 0 "2.0" "2.1(2.0)" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "6.5" )
    if( CUDA_VERSION VERSION_LESS "11.0" )
      list( APPEND CUDA_ARCHITECTURES "3.2" "3.7" )
    endif()
    if( CUDA_VERSION VERSION_LESS "12.0" )
      list( APPEND CUDA_ARCHITECTURES "5.3" )
    endif()
  endif()
  if( CUDA_VERSION VERSION_GREATER "7.5" )
    list( APPEND CUDA_ARCHITECTURES "6.0" "6.1" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "8.0" )
    list( APPEND CUDA_ARCHITECTURES "6.2" "7.0" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "10.0" )
    list( APPEND CUDA_ARCHITECTURES "7.5" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "11.0" )
    list( APPEND CUDA_ARCHITECTURES "8.0" "8.6" )
  endif()
  string( APPEND CUDA_ARCHITECTURES "+PTX" )
endif()

Happy to submit a pull request with this if you'd like me to.
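
One detail worth checking in the snippet above is how CMake's strict version comparisons behave at the boundary: VERSION_LESS "11.0" and VERSION_GREATER "11.0" are both false for exactly 11.0. The sketch below only illustrates that semantics; the loop variable version and the three sample values are assumptions standing in for CUDA_VERSION, which is normally set by find_package( CUDA ).

```cmake
cmake_minimum_required( VERSION 3.10 )

# Illustration only: evaluate the two comparisons for assumed CUDA versions.
foreach( version IN ITEMS "10.2" "11.0" "11.1" )
  if( version VERSION_LESS "11.0" )
    message( STATUS "${version}: Kepler (3.x) entries kept" )
  else()
    message( STATUS "${version}: Kepler (3.x) entries skipped" )
  endif()
  if( version VERSION_GREATER "11.0" )
    message( STATUS "${version}: 8.0 and 8.6 appended" )
  else()
    message( STATUS "${version}: 8.0 and 8.6 not appended" )
  endif()
endforeach()
```

Under these semantics a build with CUDA 11.0 itself drops Kepler but does not yet add 8.0 or 8.6. If that is not intended, VERSION_GREATER_EQUAL (available since CMake 3.7) would be the comparison to reach for.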

ptbrown1729 commented 3 years ago

I ran into this same issue with CUDA 11 on Linux Mint 19.3 and CMake 3.20.1.

Trying the above code, I got an error from CMake at the line list( INSERT CUDA_ARCHITECTURES "5.0" "5.2" ). It looks like the insert command is missing the list index.

I changed this to list( INSERT CUDA_ARCHITECTURES 0 "5.0" "5.2" ) and CMake succeeded
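
The correction described here can be sketched as a standalone script. list( INSERT ... ) takes the list name, then an element index, then the elements, and index 0 is valid for an empty list. The CUDA_VERSION value below is an assumption for illustration only; in the real build it comes from find_package( CUDA ).

```cmake
cmake_minimum_required( VERSION 3.10 )

set( CUDA_VERSION "11.0" )  # assumed value, for illustration only
set( CUDA_ARCHITECTURES "" )

# list( INSERT <list> <index> <element>... ): the index is required.
if( CUDA_VERSION VERSION_LESS "11.0" )
  list( INSERT CUDA_ARCHITECTURES 0 "3.0" "3.5" )
endif()
if( CUDA_VERSION VERSION_LESS "12.0" )
  list( INSERT CUDA_ARCHITECTURES 0 "5.0" "5.2" )
endif()

message( STATUS "CUDA_ARCHITECTURES=${CUDA_ARCHITECTURES}" )
```

Because both calls insert at index 0, a CUDA version below 11 would end up with the 5.x entries ahead of the 3.x ones; list( APPEND ... ) would preserve ascending order if that matters.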

lmmx commented 3 years ago

Ahhh that would explain it, I just cheated and specified the architectures I wanted explicitly, thanks

jkfindeisen commented 3 years ago

Thanks for reporting the issue and the extensive documentation. In the most recent commit, I adapted CMake with your proposed changes including the comments by ptbrown above.

gpufit commented 3 years ago

The automatic CUDA architecture assignment has been revised in the latest commit.