3dem / relion

Image-processing software for cryo-electron microscopy
https://relion.readthedocs.io/en/latest/
GNU General Public License v2.0

Install problem related to blushing #1064

Open unaginekomoyashi opened 9 months ago

unaginekomoyashi commented 9 months ago

When installing relion5, I ran cmake as shown below and built it using make. The build went well, but when I turned on blushing in refine, the following error occurred. I have confirmed the cudnn path with cuda12.3. The ~/.tcshrc settings and cmake results are shown for reference. I would appreciate any comments from experts. Thank you.

========== relion error log
Authorization required, but no authorization protocol specified
Authorization required, but no authorization protocol specified
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
Could not load library libcudnn_cnn_infer.so.8. Error: libcuda.so: cannot open shared object file: No such file or directory
/data1/softwares/relion/build/bin/relion_python_blush: line 36: 354460 Aborted (core dumped) TORCH_HOME="$torch_home" "$python_executable" -c "from relion_blush import main; exit(main())" "$@"

================= this shows ~/.tcshrc
setenv PATH /usr/local/cuda-12.3/bin:/usr/lib64/openmpi/bin:/usr/lib64:/usr/bin:/data1/softwares/relion/build/bin:/usr/local/miniconda3/envs/relion-5.0/bin/python:
setenv CUDA_HOME /usr/local/cuda-12.3
setenv CUDA_ROOT /usr/local/cuda-12.3
setenv LD_LIBRARY_PATH /usr:/usr/local/lib:/usr/local/cuda-12.3/lib64:/usr/loca/cuda/lib64:/usr/local/miniconda3/envs/relion-5.0/lib/python3.10/site-packages/nvidia/cudnn/lib:/usr/local/cuda-12.3/targets/x86_64-linux/lib/stubs/libcuda.so:/usr/lib/aarch64-linux-gnu:${LD_LIBRARY_PATH}
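A note on the settings above: the dynamic loader treats every LD_LIBRARY_PATH entry as a directory, yet one entry here ends in a file name (.../stubs/libcuda.so) and another reads /usr/loca/cuda/lib64. Whether either is related to the error is not established in this thread. The following is only a minimal sketch for spotting such entries; the function name `audit_search_path` is made up for illustration and is not part of RELION.

```python
import os


def audit_search_path(value, sep=":"):
    """Classify each entry of a PATH-style string (hypothetical helper).

    Returns a list of (entry, status) tuples, where status is one of
    "ok" (an existing directory), "not a directory" (exists, but is a
    file or similar) or "missing" (does not exist on this machine).
    Empty entries are skipped.
    """
    report = []
    for entry in value.split(sep):
        if not entry:
            continue
        if os.path.isdir(entry):
            status = "ok"
        elif os.path.exists(entry):
            status = "not a directory"
        else:
            status = "missing"
        report.append((entry, status))
    return report


if __name__ == "__main__":
    # Run from the same shell that launches RELION so the variable matches.
    for entry, status in audit_search_path(os.environ.get("LD_LIBRARY_PATH", "")):
        print(f"{status:16} {entry}")
```

Entries reported as "missing" or "not a directory" are candidates for correction, for example by listing the directory that contains the library rather than the library file itself.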

==================================================== cmake command and result
cmake .. -DPYTHON_EXE_PATH=/usr/local/miniconda3/envs/relion-5.0/bin/python3.10 -DTORCH_HOME_PATH=/data1/softwares/relion/build/relion_torch -DFETCH_WEIGHTS=OFF -DCMAKE_C_COMPILER=/usr/bin/mpicc -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-12.3 -DFETCH_WEIGHTS=OFF
CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 2.8.12 will be removed from a future version of CMake.

Update the VERSION argument value or use a ... suffix to tell CMake that the project does not need compatibility with older versions.

-- BUILD TYPE set to the default type: 'Release'
-- Setting fallback CUDA_ARCH=50
-- CUDA enabled - Building CUDA-accelerated version of RELION
-- Setting cpu precision to double
-- Setting accelerated code precision to single
CMake Warning (dev) at CMakeLists.txt:192 (FIND_PACKAGE):
Policy CMP0074 is not set: find_package uses _ROOT variables. Run "cmake --help-policy CMP0074" for policy details. Use the cmake_policy command to set the policy and suppress this warning.

Environment variable CUDA_ROOT is set to:

/usr/local/cuda-12.3

For compatibility, CMake is ignoring the variable. This warning is for project developers. Use -Wno-dev to suppress it.

-- Using cuda wrapper to compile....
-- Cuda version is >= 7.5 and single-precision build, enable double usage warning.
-- MPI_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include;/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi
-- MPI_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- MPI_CXX_INCLUDE_PATH : /usr/lib/x86_64-linux-gnu/openmpi/include;/usr/lib/x86_64-linux-gnu/openmpi/include/openmpi
-- MPI_CXX_LIBRARIES : /usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi_cxx.so;/usr/lib/x86_64-linux-gnu/openmpi/lib/libmpi.so
-- CMAKE_C_COMPILER : /usr/bin/mpicc
-- CMAKE_CXX_COMPILER : /usr/bin/c++
-- MPI_C_COMPILER : /usr/bin/mpicc
-- MPI_CXX_COMPILER : /usr/bin/mpicxx
-- CMAKE_CXX_COMPILER_ID : GNU
-- Could NOT find FLTK (missing: FLTK_LIBRARIES)
-- No FLTK installation was found
-- Found previously built non-system FLTK libraries that will be used.
-- FLTK_INCLUDE_DIR: /data1/softwares/relion/external/fltk/include
-- FLTK_LIBRARIES: /data1/softwares/relion/external/fltk/lib/libfltk.so
-- Found FFTW
-- FFTW_PATH: /usr/include
-- FFTW_INCLUDES: /usr/include
-- FFTW_LIBRARIES: /usr/lib/x86_64-linux-gnu/libfftw3f.so;/usr/lib/x86_64-linux-gnu/libfftw3.so
-- Using Python executable: /usr/local/miniconda3/envs/relion-5.0/bin/python3.10
-- Will try to find Torch home directory...
-- Using Torch home: /data1/softwares/relion/build/relion_torch
-- Omitting download of model weights for dependent packages
-- Building static libs (larger build size and binaries)
Running apps/CMakeLists.txt...
-- CMAKE_BINARY_DIR:/data1/softwares/relion/build
-- Git commit ID: 90d239eda870d64834e642aaa19ce3e6d0ab5ab0
-- CMAKE_C_FLAGS : -std=c99 -fopenmp
-- CMAKE_CXX_FLAGS : -fPIC -std=c++14 -fopenmp
-- Configuring done
-- Generating done
-- Build files have been written to: /data1/softwares/relion/build

biochem-fan commented 9 months ago

Usually CUDNN is provided by the relion-5.0 conda environment (/usr/local/miniconda3/envs/relion-5.0/bin/python3.10). I'm not sure if using LD_LIBRARY_PATH is a good idea. I also don't know if conda supports tcsh.

unaginekomoyashi commented 9 months ago

I compared this machine with a PC where RELION is already installed and works without problems in a csh environment, so I would like to downgrade and try a different version of CUDA. Thank you for your comment.

unaginekomoyashi commented 9 months ago

Adding "usr/local/cuda-12.2/cudnn-8.9.5.29/lib" to LD_LIBRARY_PATH in the tcsh environment resolved the error, regardless of the CUDA version.
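For anyone who wants to confirm a fix like this before starting a job, here is a rough sketch that tries to load the two libraries named in the error log through Python's ctypes. The helper name `try_load` is hypothetical, and this only checks that the dynamic loader can find the libraries; it does not reproduce what relion_python_blush does internally.

```python
import ctypes


def try_load(names):
    """Attempt to dlopen each library name (hypothetical helper).

    Returns a dict mapping each name to None on success, or to the
    loader's error message on failure.
    """
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # Library names taken from the error log in this issue.
    for name, err in try_load(["libcuda.so", "libcudnn_cnn_infer.so.8"]).items():
        print(name, "OK" if err is None else f"FAILED: {err}")
```

Run it with the Python executable passed to cmake, from the same tcsh session used for RELION, since LD_LIBRARY_PATH is read when the process starts.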