DeadSix27 / waifu2x-converter-cpp

Improved fork of Waifu2X C++ using OpenCL and OpenCV
MIT License

Compile Error with CUDA 11 #250

Closed CoelacanthusHex closed 3 years ago

CoelacanthusHex commented 4 years ago

I tried to compile it with CUDA 11, but the build failed.

I suspect the cause is that 'Support for Kepler sm_30 architecture based products was dropped in CUDA 11'.

The build log:

-- The CXX compiler identification is GNU 10.1.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The C compiler identification is GNU 10.1.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Found git, set version to: v5.3.3 (master-e7de04d)
-- System is: Linux (Linux)
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - found
-- Found OpenCL: /usr/lib/libOpenCL.so (found version "2.2") 
-- Found OpenCV: /usr (found version "4.3.0") 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found CUDA: /opt/cuda (found version "11.0") 
-- 
-- Config summary:
--      OpenCV: 4.3.0
--      OpenCL: 2.2
--      CUDA: 11.0
--      Unicode: TRUE
--      Installing models to: /usr/share/waifu2x-converter-cpp
--      Not building test binaries
--      Building for: Unix Makefiles-x86_64
-- 
-- Configuring done
-- Generating done
-- Build files have been written to: /build/waifu2x-converter-cpp-cuda-git/src/build
Scanning dependencies of target conv
[  4%] Building C object CMakeFiles/conv.dir/conv.c.o
[  8%] Linking C executable conv
[  8%] Built target conv
Scanning dependencies of target gensrcs
[ 17%] Generating modelHandler_CUDA.ptx30
[ 17%] Generating modelHandler_OpenCL.cl.h
nvcc fatal   : Value 'sm_30' is not defined for option 'gpu-architecture'
make[2]: *** [CMakeFiles/gensrcs.dir/build.make:91: modelHandler_CUDA.ptx30] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:128: CMakeFiles/gensrcs.dir/all] Error 2
make: *** [Makefile:160: all] Error 2

System Info: Arch Linux Rolling 2020-07-25 CUDA 11.0.2-1

DDoSolitary commented 4 years ago

Same here

ploober commented 4 years ago

Hello friends,

I hit the same error when compiling this project on Mint 20 (Ubuntu 20.04) with CUDA 11.0 (from the NVIDIA apt repository for Ubuntu).

I was able to get it working with some small modifications, see: https://github.com/ploober/waifu2x-converter-cpp/commit/3c811819e236beae4c0c27cc87b643f6bb942792

To summarize the change: If CUDA 11+ is detected, it will only build with a model handler for -arch=sm_60, which avoids the issue with -arch=sm_30 being unsupported in CUDA 11. If CUDA below version 11 is detected it will fall back to the original behavior.
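
The selection rule described above can be sketched as a small C++ function. This is only an illustration of the logic, not code from the commit: the function name `ptxArchsForToolkit` is hypothetical, the CUDA 11+ value is the sm_60 chosen in this comment, and the pre-11 branches mirror the existing behavior in CMakeLists.txt (sm_20 and sm_30 before CUDA 9, sm_30 only from CUDA 9 on).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the project): returns the -arch values
// that PTX would be generated for, given the CUDA toolkit major version.
std::vector<std::string> ptxArchsForToolkit(int cudaMajor) {
    if (cudaMajor >= 11) {
        // nvcc in CUDA 11 rejects sm_30, so only a newer target is built.
        return {"sm_60"};
    }
    if (cudaMajor >= 9) {
        // CUDA 9 no longer supports compute capability 2.0.
        return {"sm_30"};
    }
    // Original behavior for older toolkits: build both variants.
    return {"sm_30", "sm_20"};
}
```

With this rule, a CUDA 11.0 toolkit never asks nvcc for sm_30, which is the value the build log above fails on.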

@DeadSix27 if this is suitable for a PR, I can open one. If not, that's fine; I see that #166 could potentially address this in a more universal way.

sl1pkn07 commented 4 years ago

Workaround(?):

diff --git a/CMakeLists.txt b/CMakeLists.txt
index a13b35d..c7941af 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -378,20 +378,34 @@ if(CUDA_FOUND)
             DEPENDS src/modelHandler_CUDA.cu
         )
     endif()
+    if (CUDA_VERSION_MAJOR LESS 10)
+        add_custom_command(
+            OUTPUT modelHandler_CUDA.ptx30.h
+            COMMAND ${CONV_EXE} modelHandler_CUDA.ptx30 modelHandler_CUDA.ptx30.h str
+            DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30 conv
+        )
+        add_custom_command(
+            OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30
+            COMMAND ${CUDA_NVCC_EXECUTABLE} ${CUDA_NVCC_FLAGS} -arch=sm_30 -ptx -o ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30 ${CMAKE_CURRENT_SOURCE_DIR}/src/modelHandler_CUDA.cu
+            DEPENDS src/modelHandler_CUDA.cu
+        )
+    endif()
     add_custom_command(
-        OUTPUT modelHandler_CUDA.ptx30.h
-        COMMAND ${CONV_EXE} modelHandler_CUDA.ptx30 modelHandler_CUDA.ptx30.h str
-        DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30 conv
+        OUTPUT modelHandler_CUDA.ptx35.h
+        COMMAND ${CONV_EXE} modelHandler_CUDA.ptx35 modelHandler_CUDA.ptx35.h str
+        DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx35 conv
     )
     add_custom_command(
-        OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30
-        COMMAND ${CUDA_NVCC_EXECUTABLE} ${CUDA_NVCC_FLAGS} -arch=sm_30 -ptx -o ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx30 ${CMAKE_CURRENT_SOURCE_DIR}/src/modelHandler_CUDA.cu
+        OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx35
+        COMMAND ${CUDA_NVCC_EXECUTABLE} ${CUDA_NVCC_FLAGS} -arch=sm_35 -ptx -o ${CMAKE_CURRENT_BINARY_DIR}/modelHandler_CUDA.ptx35 ${CMAKE_CURRENT_SOURCE_DIR}/src/modelHandler_CUDA.cu
         DEPENDS src/modelHandler_CUDA.cu
     )
     if (CUDA_VERSION_MAJOR LESS 9)
         set(GPU_CODE ${GPU_CODE} modelHandler_CUDA.ptx30.h modelHandler_CUDA.ptx20.h)
-    else()
+    elseif (CUDA_VERSION_MAJOR LESS 10)
         set(GPU_CODE ${GPU_CODE} modelHandler_CUDA.ptx30.h)
+    else()
+        set(GPU_CODE ${GPU_CODE} modelHandler_CUDA.ptx35.h)
     endif()
 endif()

diff --git a/src/modelHandler_CUDA.cpp b/src/modelHandler_CUDA.cpp
index af879fe..429200c 100644
--- a/src/modelHandler_CUDA.cpp
+++ b/src/modelHandler_CUDA.cpp
@@ -47,9 +47,14 @@ static void *handle;
         static const char prog20[] =
             #include "modelHandler_CUDA.ptx20.h"
             ;
-    #endif // CUDART_VERSION < 9000
-    static const char prog30[] =
-        #include "modelHandler_CUDA.ptx30.h"
+    #endif // CUDART_VERSION < 10000
+    #if CUDART_VERSION < 10000
+        static const char prog30[] =
+            #include "modelHandler_CUDA.ptx30.h"
+            ;
+    #endif // CUDART_VERSION < 10000
+    static const char prog35[] =
+        #include "modelHandler_CUDA.ptx35.h"
    ;
 #endif // HAVE_CUDA

@@ -172,9 +177,15 @@ namespace w2xc
            return false;
        }

-       const char *prog = prog30;
+       const char *prog = prog35;
        // cuda 9.0 doesn't support Compute 20

+#if CUDART_VERSION < 10000
+       if (cap_major < 3)
+       {
+           prog = prog30;
+       }
+#endif // CUDART_VERSION < 10000
 #if CUDART_VERSION < 9000
        if (cap_major < 3)
        {

jeffshee commented 4 years ago

If you don't need to build it yourself, you can get the AppImage here: https://github.com/jeffshee/waifu2x-appimage/releases/tag/v5.3.3a

ploober commented 4 years ago

If anyone looking at this issue is still experiencing the error and wants a one-line fix:

sed -i 's/arch=sm_20/arch=sm_60/g' CMakeLists.txt && sed -i 's/arch=sm_30/arch=sm_60/g' CMakeLists.txt

Then run cmake just as you normally would.

Feel free to replace "sm_60" with your preferred value: https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
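
When picking a value, it may help to keep in mind that PTX built for a given compute capability can generally be JIT-compiled only on devices with an equal or higher capability, so sm_60 leaves out Kepler and Maxwell cards. A minimal C++ sketch of that check follows; both function names are hypothetical and nothing here queries a real GPU, the capability numbers are passed in by hand.

```cpp
#include <string>

// Hypothetical helper: builds the nvcc -arch value for a compute capability,
// e.g. (6, 1) -> "sm_61". Assumes a single-digit minor version.
std::string archFlag(int capMajor, int capMinor) {
    return "sm_" + std::to_string(capMajor) + std::to_string(capMinor);
}

// Hypothetical helper: true if PTX targeting (ptxMajor, ptxMinor) can be
// loaded on a device with compute capability (devMajor, devMinor).
// Rule of thumb: the device capability must be equal or higher.
bool ptxRunsOnDevice(int ptxMajor, int ptxMinor, int devMajor, int devMinor) {
    if (devMajor != ptxMajor) {
        return devMajor > ptxMajor;
    }
    return devMinor >= ptxMinor;
}
```

For example, `ptxRunsOnDevice(6, 0, 3, 5)` is false (an sm_35 Kepler card cannot use sm_60 PTX), while `ptxRunsOnDevice(6, 0, 7, 5)` is true. If your card is older than the value you substitute, choose a lower one that your CUDA version still accepts.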

YukihoAA commented 3 years ago

Fixed by PR #252.