Kepler architecture is deprecated from CUDA 11 and nvcc throws an error

lmmx commented 3 years ago

When trying to build Gpufit on Linux with CUDA 11, I received an error that compute_30 was an "unsupported architecture".

nvcc fatal   : Unsupported gpu architecture 'compute_30'
CMake Error at Gpufit_generated_cuda_gaussjordan.cu.o.RELEASE.cmake:222 (message):
  Error generating
  /home/louis/dev/gpufit_dev/gpufit-build/Gpufit/CMakeFiles/Gpufit.dir//./Gpufit_generated_cuda_gaussjordan.cu.o

make[2]: *** [Gpufit/CMakeFiles/Gpufit.dir/build.make:93: Gpufit/CMakeFiles/Gpufit.dir/Gpufit_generated_cuda_gaussjordan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1114: Gpufit/CMakeFiles/Gpufit.dir/all] Error 2
make: *** [Makefile:95: all] Error 2

This arises because I have CUDA 11, but the first two architectures in the CUDA_ARCHITECTURES list are 3.0 and 3.5:

-- CUDA_ARCHITECTURES=3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX
-- CUDA_NVCC_FLAGS=-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,...
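
For context, each entry in CUDA_ARCHITECTURES ends up as a -gencode pair on the nvcc command line, which is why a 3.0 entry leads to the compute_30 error. A rough sketch of that mapping follows. It is not Gpufit's actual build code, ARCH_LIST and NVCC_FLAGS are hypothetical names, and the "+PTX" suffix is not handled:

```cmake
# Illustrative only: turn "major.minor" entries into nvcc -gencode flags.
# ARCH_LIST and NVCC_FLAGS are hypothetical names, not Gpufit variables.
set( ARCH_LIST 3.0 3.5 5.0 )
set( NVCC_FLAGS "" )

foreach( arch ${ARCH_LIST} )
  # Drop the dot: "3.0" becomes "30"
  string( REPLACE "." "" sm "${arch}" )
  list( APPEND NVCC_FLAGS -gencode "arch=compute_${sm},code=sm_${sm}" )
endforeach()

message( STATUS "NVCC_FLAGS=${NVCC_FLAGS}" )
```

Saved as a file, this can be run in script mode with cmake -P to inspect the generated flags without invoking nvcc.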

The problem is that, as of CUDA 11, compute_30 is no longer supported, as documented here:

Fermi†   Kepler†   Maxwell‡   Pascal   Volta   Turing   Ampere   Lovelace*   Hopper**
sm_20    sm_30     sm_50      sm_60    sm_70   sm_75    sm_80    sm_90?      sm_100c?
         sm_35     sm_52      sm_61    sm_72            sm_86
         sm_37     sm_53      sm_62

† Fermi and Kepler are deprecated from CUDA 9 and 11 onwards
‡ Maxwell is deprecated from CUDA 12 onwards
* Lovelace is the microarchitecture replacing Ampere (AD102)
** Hopper is NVIDIA’s rumored “tesla-next” series, with a 5nm process.

A comment on the post also mentioned that "Support for Kepler sm_30 and sm_32 architecture based products is dropped.", i.e. 3.2 belongs alongside 3.0, 3.5, and 3.7 in the Kepler column of the table from the blog post (reproduced above).

As noted, Fermi and Kepler are deprecated from CUDA 9 and CUDA 11 onwards (respectively, I presume), i.e. in CUDA 11 Kepler is deprecated, and with it 30, 32, 35, and 37.
Additionally, note that from CUDA 12 the Maxwell architectures (50, 52, 53) will be deprecated.

This explains why compute_30 was throwing an error and suggests how to fix it: test whether the CUDA version is greater than or equal to 11 and, if so, skip architectures 3.7 and below.

To resolve this I changed Gpufit/Gpufit/CMakeLists.txt to start from an empty list instead. While I was fixing it for my current architecture, I thought I should also future-proof it for CUDA 12 and post it here:

elseif( CUDA_ARCHITECTURES STREQUAL All )
# All does not include the latest PTX!
  set( CUDA_ARCHITECTURES "" )
  if( CUDA_VERSION VERSION_LESS "11.0" )
    list( INSERT CUDA_ARCHITECTURES "3.0" "3.5" )
  endif()
  if( CUDA_VERSION VERSION_LESS "12.0" )
    list( INSERT CUDA_ARCHITECTURES "5.0" "5.2" )
  endif()
  if( CUDA_VERSION VERSION_LESS "9.0" )
    list( INSERT CUDA_ARCHITECTURES 0 "2.0" "2.1(2.0)" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "6.5" )
    if( CUDA_VERSION VERSION_LESS "11.0" )
      list( APPEND CUDA_ARCHITECTURES "3.2" "3.7" )
    endif()
    if( CUDA_VERSION VERSION_LESS "12.0" )
      list( APPEND CUDA_ARCHITECTURES "5.3" )
    endif()
  endif()
  if( CUDA_VERSION VERSION_GREATER "7.5" )
    list( APPEND CUDA_ARCHITECTURES "6.0" "6.1" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "8.0" )
    list( APPEND CUDA_ARCHITECTURES "6.2" "7.0" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "10.0" )
    list( APPEND CUDA_ARCHITECTURES "7.5" )
  endif()
  if( CUDA_VERSION VERSION_GREATER "11.0" )
    list( APPEND CUDA_ARCHITECTURES "8.0" "8.6" )
  endif()
  string( APPEND CUDA_ARCHITECTURES "+PTX" )
endif()

I cross-referenced against the CUDA docs for Ampere, Volta, and Turing.
I considered using VERSION_GREATER_EQUAL, but apparently this breaks backward compatibility with CMake versions before 3.7.
When building, I had to skip this control flow block altogether by passing cmake -DCUDA_ARCHITECTURES="8.0 8.6+PTX". Perhaps this could be recommended in the docs.
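
On the VERSION_GREATER_EQUAL point: one way to express "greater than or equal" that should also work with CMake older than 3.7 is to negate VERSION_LESS. A minimal sketch, with CUDA_VERSION hard-coded purely for illustration (in the real build it is set when CUDA is found):

```cmake
# Assumption: CUDA_VERSION is hard-coded here so the snippet runs on its own.
set( CUDA_VERSION "11.0" )

# Same meaning as: if( CUDA_VERSION VERSION_GREATER_EQUAL "11.0" ),
# but uses only operators available before CMake 3.7.
if( NOT CUDA_VERSION VERSION_LESS "11.0" )
  message( STATUS "CUDA >= 11: Kepler architectures would be skipped" )
else()
  message( STATUS "CUDA < 11: Kepler architectures would be kept" )
endif()
```

This avoids having to pick a "just below" version number for VERSION_GREATER, which is easy to get wrong when minor releases exist.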

Happy to submit a pull request with this if you'd like me to.

Edit: I summarised my installation into a brief guide here, in case it's of any use to anyone else!

ptbrown1729 commented 3 years ago

I ran into this same issue with CUDA 11 on Linux Mint 19.3 and CMake 3.20.1.

Trying the above code, I got an error from CMake at the line list( INSERT CUDA_ARCHITECTURES "5.0" "5.2" ). It looks like the insert command is missing the list index.

I changed this to list( INSERT CUDA_ARCHITECTURES 0 "5.0" "5.2" ) and CMake succeeded.
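
For reference, a small standalone sketch of the corrected calls. list( INSERT ) takes the list name, then an index, then the elements. CUDA_VERSION is hard-coded here as an assumption so the snippet can be run with cmake -P:

```cmake
# Assumption: CUDA_VERSION is hard-coded so this runs outside a real build.
set( CUDA_VERSION "10.2" )
set( CUDA_ARCHITECTURES "" )

# Index 0 prepends, so the Maxwell entries go in first...
if( CUDA_VERSION VERSION_LESS "12.0" )
  list( INSERT CUDA_ARCHITECTURES 0 "5.0" "5.2" )
endif()
# ...and the Kepler entries are then placed in front of them.
if( CUDA_VERSION VERSION_LESS "11.0" )
  list( INSERT CUDA_ARCHITECTURES 0 "3.0" "3.5" )
endif()

message( STATUS "CUDA_ARCHITECTURES=${CUDA_ARCHITECTURES}" )
```

With the hard-coded 10.2 this should print CUDA_ARCHITECTURES=3.0;3.5;5.0;5.2. Setting CUDA_VERSION to "11.0" leaves only the 5.x entries.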

lmmx commented 3 years ago

Ahhh that would explain it, I just cheated and specified the architectures I wanted explicitly, thanks

jkfindeisen commented 3 years ago

Thanks for reporting the issue and the extensive documentation. In the most recent commit, I adapted CMake with your proposed changes including the comments by ptbrown above.

gpufit commented 3 years ago

The automatic CUDA architecture assignment has been revised in the latest commit.

gpufit / Gpufit

Kepler architecture is deprecated from CUDA 11 and nvcc throws an error #87