build issue cuda 11.3

azazellochg commented 3 years ago

I have a problem compiling RELION 4 with CUDA 11.

Environment:

OS: Debian, kernel 5.14.0-2
OpenMPI 4.1.2rc1
gcc 10.3.0
nvidia driver 470.74
nvcc cuda_11.3.r11.3/compiler.29920130_0
RELION version 4.0 86fe8da5d06329e89ed7cf2648a7cdc38cefa079
GPU: RTX3090

Command: cmake -DCUDA_ARCH=86 -DFORCE_OWN_FFTW=ON -DAMDFFTW=ON .. && make -j 12

CMakeCache.txt

Error:

/usr/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/include/thrust/system/cuda/config.h(107): error: expected a ";"

/usr/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/include/thrust/system/cuda/config.h(107): error: expected a ";"

/usr/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/include/thrust/system/cuda/config.h(107): error: expected a ";"

2 errors detected in the compilation of "/home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_projector_plan.cu".
/usr/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/include/thrust/system/cuda/config.h(107): error: expected a ";"

CMake Error at relion_gpu_util_generated_cuda_projector_plan.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/gsharov/soft/relion-4.0/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_projector_plan.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:4000: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector_plan.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
2 errors detected in the compilation of "/home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_helper_functions.cu".
2 errors detected in the compilation of "/home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_autopicker.cu".
CMake Error at relion_gpu_util_generated_cuda_helper_functions.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/gsharov/soft/relion-4.0/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_helper_functions.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:2045: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_helper_functions.cu.o] Error 1
CMake Error at relion_gpu_util_generated_cuda_autopicker.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/gsharov/soft/relion-4.0/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_autopicker.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:731: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_autopicker.cu.o] Error 1
2 errors detected in the compilation of "/home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_ml_optimiser.cu".
CMake Error at relion_gpu_util_generated_cuda_ml_optimiser.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/gsharov/soft/relion-4.0/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_ml_optimiser.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:3060: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_ml_optimiser.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:2558: src/apps/CMakeFiles/relion_gpu_util.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

biochem-fan commented 3 years ago

Where is your /usr/include/thrust from? Older versions of thrust are not compatible with CUDA 11.

azazellochg commented 3 years ago

It comes from libthrust-dev 1.14.0-1

azazellochg commented 2 years ago

The issue was resolved once the NVIDIA driver was updated to 470.82.00.

FraaaMazzz commented 2 years ago

Hello! I am having a similar issue with the RELION installation.

This is the error I get.

nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/local/cuda/include/thrust/system/cuda/config.h(107): error: expected a ";"

2 errors detected in the compilation of "/home/cryo-em/Desktop/software/em/relion-3.1.3/src/acc/cuda/cuda_projector_plan.cu".
CMake Error at relion_gpu_util_generated_cuda_projector_plan.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/cryo-em/Desktop/software/em/relion-3.1.3/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_projector_plan.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:114: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector_plan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:612: src/apps/CMakeFiles/relion_gpu_util.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
Traceback (most recent call last):
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/__main__.py", line 474, in <module>
    main()
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/__main__.py", line 297, in main
    installPluginMethods()
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/install_plugin.py", line 259, in installPluginMethods
    pinfo.installBin({'args': [binTarget, '-j', numberProcessor]})
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/plugin_funcs.py", line 166, in installBin
    environment.execute()
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/funcs.py", line 748, in execute
    self._executeTargets(targetList)
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/funcs.py", line 690, in _executeTargets
    tgt.execute()
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/funcs.py", line 221, in execute
    command.execute()
  File "/home/cryo-em/miniconda3/envs/scipion3/lib/python3.8/site-packages/scipion/install/funcs.py", line 161, in execute
    assert glob(t), ("target '%s' not built (after "
AssertionError: target '/home/cryo-em/Desktop/software/em/relion-3.1.3/bin/relion_refine' not built (after running 'make -j 1')
Error at main: target '/home/cryo-em/Desktop/software/em/relion-3.1.3/bin/relion_refine' not built (after running 'make -j 1')

Can anyone help me?

biochem-fan commented 2 years ago

First, please check your CUDA SDK and driver versions.

Next, please try compiling via cmake and make. Compilation via a wrapper adds another layer of complexity. It hides exactly what is happening. Only after you succeed in building yourself, try wrappers.
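The version check suggested above can be sketched as a short shell function. This is only an illustration: `report_cuda_versions` is a hypothetical helper name, and the script assumes nothing more than `nvcc` and `nvidia-smi` possibly being on PATH. It reports which `nvcc` is picked up and which driver the GPUs are running, and it says so when either tool is missing or broken.

```shell
#!/bin/sh
# Hypothetical helper: report the CUDA toolkit (nvcc) and driver versions.
report_cuda_versions() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc path: $(command -v nvcc)"
        # The "release" line of nvcc --version carries the toolkit version.
        nvcc --version 2>&1 | grep -i "release" \
            || echo "nvcc is on PATH but did not report a release"
    else
        echo "nvcc path: not found"
    fi

    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per GPU: model name and driver version.
        nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
            || echo "nvidia-smi is on PATH but failed to query the driver"
    else
        echo "nvidia-smi: not found"
    fi
}

report_cuda_versions
```

Note that the "CUDA Version" shown in the nvidia-smi banner is the highest CUDA version the driver supports, not the version of the installed toolkit, so the two checks answer different questions.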

FraaaMazzz commented 2 years ago

nvcc --version gives a weird output (I don't understand why): /usr/lib/cuda/bin/nvcc: 3: exec: /usr/lib/nvidia-cuda-toolkit/bin/nvcc: not found

But nvidia-smi prints out this information:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.39.01    Driver Version: 510.39.01    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:17:00.0 Off |                  N/A |
| 37%   48C    P2   110W / 350W |   2462MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:65:00.0  On |                  N/A |
| 34%   44C    P8    40W / 350W |    354MiB / 24576MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2565      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      3262      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A    260163      C   ...ryolo-1.7.6/bin/python3.6     2449MiB |
|    1   N/A  N/A      2565      G   /usr/lib/xorg/Xorg                 53MiB |
|    1   N/A  N/A      3262      G   /usr/lib/xorg/Xorg                175MiB |
|    1   N/A  N/A      3390      G   /usr/bin/gnome-shell               61MiB |
|    1   N/A  N/A      4728      G   ...mviewer/tv_bin/TeamViewer       14MiB |
|    1   N/A  N/A     10297      G   ...AAAAAAAAA= --shared-files       30MiB |
+-----------------------------------------------------------------------------+

When I run cmake .. I get some errors. This is what I get:

-- BUILD TYPE set to the default type:  'Release'
-- Setting fallback CUDA_ARCH=35
-- ALLOW_CTF_IN_SAGD enabled - This build of RELION allows modulation of particle images by a contrast transfer function inside stochastic average gradient descent, as specified in Claim 1 of patent US10,282,513B2
-- CUDA enabled - Building CUDA-accelerated version of RELION
-- Setting cpu precision to double
-- Setting accelerated code precision to single
-- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS CUDA_CUDART_LIBRARY) (found version ".")
-- Using non-cuda compilation....
-- MPI_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi;/usr/lib/x86_64-linux-gnu/openmpi/include
-- MPI_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- MPI_CXX_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi;/usr/lib/x86_64-linux-gnu/openmpi/include
-- MPI_CXX_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- CMAKE_C_COMPILER : /usr/bin/cc
-- CMAKE_CXX_COMPILER : /usr/bin/c++
-- MPI_C_COMPILER : /usr/bin/mpicc
-- MPI_CXX_COMPILER : /usr/bin/mpicxx
-- CMAKE_CXX_COMPILER_ID : GNU
-- Could NOT find FLTK (missing: FLTK_LIBRARIES FLTK_INCLUDE_DIR FLTK_FLUID_EXECUTABLE) 
-- No FLTK installation was found
-- --------------------------------------------------------
-- -------- NO EXISTING FLTK LIBRARIES WHERE FOUND. -------
-- -------------- FLTK WILL BE DOWNLOADED AND -------------
-- --------------- BUILT DURING COMPILE-TIME. -------------
-- --------------------------------------------------------
-- ---- A WORKING INTERNET CONNECTION WILL BE REQUIRED. ---
-- --------------------------------------------------------
-- no previous fltk found, the following paths are set for libs/headers TO BE built
-- FLTK_INCLUDE_DIR: /home/cryo-em/Desktop/software/em/relion/external/fltk/include
-- FLTK_LIBRARIES:   /home/cryo-em/Desktop/software/em/relion/external/fltk/lib/libfltk.so
-- Found FFTW
-- FFTW_PATH: /usr/include
-- FFTW_INCLUDES: /usr/include
-- FFTW_LIBRARIES: /usr/lib/x86_64-linux-gnu/libfftw3f.so;/usr/lib/x86_64-linux-gnu/libfftw3.so
BUILD_SHARED_LIBS = OFF
-- Building static libs (larger build size and binaries)
Running apps/CMakeLists.txt...
-- CMAKE_BINARY_DIR:/home/cryo-em/Desktop/software/em/relion/build
-- Git commit ID: 72bbf0c06cea68f8992328703ee5ae5f3d1fc9b7
-- Found OpenMP_C: -fopenmp  
-- Found OpenMP_CXX: -fopenmp  
-- Found OpenMP: TRUE   
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cryo-em/Desktop/software/em/relion/build

If I move forward with make these are the last lines I get:

checking for pkg-config... /usr/bin/pkg-config
Package xft was not found in the pkg-config search path.
Perhaps you should add the directory containing `xft.pc'
to the PKG_CONFIG_PATH environment variable
No package 'xft' found
Package freetype2 was not found in the pkg-config search path.
Perhaps you should add the directory containing `freetype2.pc'
to the PKG_CONFIG_PATH environment variable
No package 'freetype2' found
checking for freetype-config... no
configure: please install pkg-config or use 'configure --disable-xft'.
configure: error: Aborting.
make[2]: *** [CMakeFiles/OWN_FLTK.dir/build.make:110: OWN_FLTK-prefix/src/OWN_FLTK-stamp/OWN_FLTK-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:243: CMakeFiles/OWN_FLTK.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

biochem-fan commented 2 years ago

Your CUDA installation is broken. Fix it first.

You also have to install xft. This is mentioned in the documentation https://relion.readthedocs.io/en/release-4.0/Installation.html.
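The FLTK configure step in the log above asks pkg-config for `xft` and `freetype2`, so the same query can be run by hand before starting a long build. A minimal sketch, where `check_pkgs` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: ask pkg-config whether each named package is visible.
check_pkgs() {
    for pkg in "$@"; do
        if pkg-config --exists "$pkg" 2>/dev/null; then
            echo "$pkg: found"
        else
            # Also printed when pkg-config itself is not installed.
            echo "$pkg: missing"
        fi
    done
}

# The two packages the FLTK configure step looked for in the log above.
check_pkgs xft freetype2
```

A "missing" result usually means the development package that ships the corresponding `.pc` file is not installed.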

FraaaMazzz commented 2 years ago

Seems to be working!

To fix the broken CUDA installation I ran sudo apt update --fix-missing, but nvcc --version still generates the same result (/usr/lib/cuda/bin/nvcc: 3: exec: /usr/lib/nvidia-cuda-toolkit/bin/nvcc: not found).

For the missing xft I ran sudo apt-get install -y libxft-dev; for some reason it had not been installed when I ran the command given in the documentation.

Alexamk commented 2 years ago

Hello, I'm having very similar issues. Also with CUDA version 11.6 and Nvidia driver 510.47.03.

The cmake command doesn't seem to give any issues:

-- BUILD TYPE set to the default type:  'Release'
-- Using provided CUDA_ARCH=61
-- ALLOW_CTF_IN_SAGD enabled - This build of RELION allows modulation of particle images by a contrast transfer function inside stochastic average gradient descent, as specified in Claim 1 of patent US10,282,513B2
-- CUDA enabled - Building CUDA-accelerated version of RELION
-- Setting cpu precision to double
-- Setting accelerated code precision to single
-- Using cuda wrapper to compile....
-- Cuda version is >= 7.5 and single-precision build, enable double usage warning.
-- MPI_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi;/usr/lib/x86_64-linux-gnu/openmpi/include
-- MPI_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- MPI_CXX_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include/openmpi;/usr/lib/x86_64-linux-gnu/openmpi/include
-- MPI_CXX_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- CMAKE_C_COMPILER : /usr/bin/cc
-- CMAKE_CXX_COMPILER : /usr/bin/c++
-- MPI_C_COMPILER : /usr/bin/mpicc
-- MPI_CXX_COMPILER : /usr/bin/mpicxx
-- CMAKE_CXX_COMPILER_ID : GNU
-- Found previously built non-system FLTK libraries that will be used.
-- FLTK_INCLUDE_DIR: /home/alex/Documents/relion/external/fltk/include
-- FLTK_LIBRARIES:   /home/alex/Documents/relion/external/fltk/lib/libfltk.so
-- Found FFTW
-- FFTW_PATH: /usr/include
-- FFTW_INCLUDES: /usr/include
-- FFTW_LIBRARIES: /usr/lib/x86_64-linux-gnu/libfftw3f.so;/usr/lib/x86_64-linux-gnu/libfftw3.so
BUILD_SHARED_LIBS = OFF
-- Building static libs (larger build size and binaries)
Running apps/CMakeLists.txt...
-- CMAKE_BINARY_DIR:/home/alex/Documents/relion/build
-- Git commit ID: 72bbf0c06cea68f8992328703ee5ae5f3d1fc9b7
-- Configuring done
-- Generating done
-- Build files have been written to: /home/alex/Documents/relion/build

But the make command gets stuck at the same point.

nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda-11.6/include/thrust/system/cuda/config.h(107): error: "#" not expected here

/usr/local/cuda-11.6/include/thrust/system/cuda/config.h(107): error: expected a ";"

2 errors detected in the compilation of "/home/alex/Documents/relion/src/acc/cuda/cuda_projector_plan.cu".
CMake Error at relion_gpu_util_generated_cuda_projector_plan.cu.o.Release.cmake:280 (message):
  Error generating file
  /home/alex/Documents/relion/build/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_projector_plan.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:679: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector_plan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:675: src/apps/CMakeFiles/relion_gpu_util.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

This is the output when I run it now. Some of the binaries are still compiled, but not all. Any ideas what could be going on?

azazellochg commented 2 years ago

The only thing I could find is https://github.com/NVIDIA/thrust/issues/979

Alexamk commented 2 years ago

Small update: RELION 4.0 compiled successfully.

n1kt0 commented 2 years ago

@Alexamk how did you fix that? I have the same problem on a Manjaro Linux installation with CUDA 11.6.0 and NVIDIA driver 510.47.03.

Alexamk commented 2 years ago

I didn't. I just decided to install 4.0 instead, which worked just fine.

n1kt0 commented 2 years ago

It works; I just forgot to switch to the ver4.0 branch.

azazellochg commented 2 years ago

Environment:

OS: Debian, kernel 5.16.0-3
OpenMPI 4.1.2
gcc 10.3.0
nvidia driver 470.103.01
nvcc cuda_11.4.r11.4/compiler.30521435_0
RELION version 4.0 ce2e9352da91ad4323a0ebbc00c6796e4b917324
GPU: RTX3090

Make fails with:

[  2%] Building NVCC (Device) object src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector_plan.cu.o
In file included from /usr/include/thrust/system/cuda/detail/execution_policy.h:35,
                 from /usr/include/thrust/iterator/detail/device_system_tag.h:23,
                 from /usr/include/thrust/iterator/detail/iterator_facade_category.h:22,
                 from /usr/include/thrust/iterator/iterator_facade.h:37,
                 from /home/gsharov/soft/relion-4.0/src/acc/cuda/cub/device/../iterator/arg_index_input_iterator.cuh:48,
                 from /home/gsharov/soft/relion-4.0/src/acc/cuda/cub/device/device_reduce.cuh:41,
                 from /home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_utils_cub.cuh:18,
                 from /home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_projector_plan.cu:10:
/usr/include/thrust/system/cuda/config.h:79:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.
   79 | #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.

Is whatever relion has in src/acc/cuda/cub/ now incompatible with cub that's now included in CUDA11 toolkit?

biochem-fan commented 2 years ago

As the message suggests, does cmake -DTHRUST_IGNORE_CUB_VERSION_CHECK help?

We cannot drop support for CUDA < 11 yet, so we cannot remove src/acc/cuda/cub. Probably we need to make #include conditional on the CUDA version.

biochem-fan commented 2 years ago

Which package put /usr/include/thrust? Do you really need it there? Usually CUDA SDK is installed in /opt/cuda-XX or /usr/local/cuda-XX and does not contaminate /usr/include.

I successfully compiled RELION 4.0 in:

Ubuntu 20.04 LTS
ICC 2021.5.0 20211109 (from oneAPI SDK)
CUDA SDK 11.6 (Installed in /home/software/packages/cuda-11.6, thus we don't have thrust in /usr/include)

azazellochg commented 2 years ago

/usr/include/thrust/system/cuda/config.h belongs to libthrust-dev. libthrust-dev reverse depends on nvidia-cuda-dev, which is required by nvidia-cuda-toolkit. The CUDA toolkit is installed via the Debian package manager, so nvcc goes to /usr/bin and libs to /usr/lib/x86_64-linux-gnu.
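For anyone wanting to repeat this ownership check on their own machine, here is a rough sketch. `thrust_info` is a hypothetical helper name, the two directories are simply the include paths that appear in this thread, and the `dpkg -S` lookup only applies to Debian-based systems (it is skipped when dpkg is absent).

```shell
#!/bin/sh
# Hypothetical helper: for each include directory, report the Thrust version
# header found there and, on Debian-based systems, the package that owns it.
thrust_info() {
    for dir in "$@"; do
        header="$dir/thrust/version.h"
        [ -f "$header" ] || continue
        echo "$header"
        grep -E '^#define[[:space:]]+THRUST_VERSION[[:space:]]' "$header" \
            || echo "  THRUST_VERSION not defined in this header"
        if command -v dpkg >/dev/null 2>&1; then
            dpkg -S "$header" 2>/dev/null || echo "  not owned by any dpkg package"
        fi
    done
}

# Include directories mentioned in this thread; adjust to your installation.
thrust_info /usr/include /usr/local/cuda/include
```

THRUST_VERSION encodes major * 100000 + minor * 100 + patch, so a value of 101400 corresponds to Thrust 1.14.0. If more than one directory prints a header, two Thrust copies are installed and the compiler's include order decides which one is used.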

cmake -DTHRUST_IGNORE_CUB_VERSION_CHECK does not help

biochem-fan commented 2 years ago

Did you test THRUST_IGNORE_CUB_VERSION_CHECK?

azazellochg commented 2 years ago

If I move away /usr/include/thrust, I get:

[  2%] Building NVCC (Device) object src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/cuda_kernels/relion_gpu_util_generated_helper.cu.o
In file included from /home/gsharov/soft/relion-4.0/src/acc/cuda/cub/device/device_reduce.cuh:41,
from /home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_utils_cub.cuh:18,
from /home/gsharov/soft/relion-4.0/src/acc/cuda/cuda_projector_plan.cu:10:
/home/gsharov/soft/relion-4.0/src/acc/cuda/cub/device/../iterator/arg_index_input_iterator.cuh:44:10: fatal error: thrust/version.h: No such file or directory
44 | #include <thrust/version.h>
|          ^~~~~~~~~~~~~~~~~~
compilation terminated.
CMake Error at relion_gpu_util_generated_cuda_projector_plan.cu.o.Release.cmake:220 (message):
Error generating
/home/gsharov/soft/relion-4.0/test/src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/./relion_gpu_util_generated_cuda_projector_plan.cu.o

make[2]: *** [src/apps/CMakeFiles/relion_gpu_util.dir/build.make:126: src/apps/CMakeFiles/relion_gpu_util.dir/__/acc/cuda/relion_gpu_util_generated_cuda_projector_plan.cu.o] Error 1

biochem-fan commented 2 years ago

Don't move it, but try THRUST_IGNORE_CUB_VERSION_CHECK.

azazellochg commented 2 years ago

After another system update, running cmake -DCUDA_ARCH=86 -DFORCE_OWN_FFTW=ON -DAMDFFTW=ON -DCC=gcc-10 -DTHRUST_IGNORE_CUB_VERSION_CHECK=1 .. followed by make worked. I don't understand what has changed; nothing CUDA-related was upgraded.
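One caveat when reading the command above: a `-DNAME=value` argument to cmake creates a CMake cache entry, which is not the same thing as a preprocessor macro. It only reaches nvcc if the project's CMake scripts forward it to the compile flags, and this thread does not establish whether RELION's scripts do so. A small sketch for inspecting what ended up in the cache, meant to be run from the build directory; `cache_entry` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the entry for variable $1 from the cache file $2.
cache_entry() {
    grep -E "^$1(:[A-Z]+)?=" "$2" || echo "$1 is not in $2"
}

if [ -f CMakeCache.txt ]; then
    # An untyped -D argument is stored as NAME:UNINITIALIZED=value.
    cache_entry THRUST_IGNORE_CUB_VERSION_CHECK CMakeCache.txt
    # Flags that the legacy FindCUDA module passes on to nvcc.
    cache_entry CUDA_NVCC_FLAGS CMakeCache.txt
fi
```

If the variable shows up only as a cache entry and not among the nvcc flags, the Thrust header never sees the macro, which would be consistent with the build succeeding for some unrelated reason.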

azazellochg commented 2 years ago

Let's hope it will keep compiling in the future, so I don't have to re-open this. Thanks for your help, @biochem-fan

FilipeMaia commented 2 years ago

I believe the original issue was solved by 554e0ed993e5ac8a3fee4be7c5cf64a62216a8c7.

FilipeMaia commented 2 years ago

I also think that, in the long term, shipping a version of cub while at the same time using the thrust that comes with CUDA (and itself depends on cub) is bound to cause unsolvable problems. A possible solution would be to do something like:

#if (__CUDACC_VER_MAJOR__ < 11) || (__CUDACC_VER_MAJOR__ == 11 && __CUDACC_VER_MINOR__ < 2)
// Only use builtin CUB for those CUDA versions that don't bundle it
#include "src/acc/cuda/cub/device/device_radix_sort.cuh"
#include "src/acc/cuda/cub/device/device_reduce.cuh"
#include "src/acc/cuda/cub/device/device_scan.cuh"
#include "src/acc/cuda/cub/device/device_select.cuh"
#else
#include <cub/device/device_radix_sort.cuh>
#include <cub/device/device_reduce.cuh>
#include <cub/device/device_scan.cuh>
#include <cub/device/device_select.cuh>
#endif

in src/acc/cuda/cuda_utils_cub.cuh

3dem / relion

build issue cuda 11.3 #827