bodono / scs-python

Python interface for SCS
MIT License

"undefined symbol: cusparseDcsrmv" error when building from source with GPU support in conda environment #23

Closed GillotP closed 2 years ago

GillotP commented 4 years ago

Hi!

First, thank you for the cvxpy project, which is very convenient to use. I've been playing with the SCS solver to solve a series of successive quadratic problems, which works pretty well. I wanted to go a step further and try the GPU support for my use case, but I have not been able to set it up, despite having read a lot of similar problems in the general SCS issues section. My problem is specific to the Python interface: I can't seem to properly link CUDA and SCS together.

First, my system is the following:

System: Ubuntu 20.04.1 64-bit Kernel Linux 5.4.0-48-generic x86_64

Hardware: Memory: 15.5GiB Processor: Intel® Core™ i7-9750H CPU @ 2.60GHz × 12 Graphics: GeForce RTX 2060

My CUDA installation is the cuda toolkit 11.0 (system wide installation, installed from the debs following nvidia's instructions). I added the exports in my ~/.bashrc:

export CUDA_HOME=/usr/local/cuda-11.0
export PATH=${CUDA_HOME}/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

When I check my environment variables in the terminal, I get:

echo $PATH
/usr/local/cuda-11.0/bin:...

echo $LD_LIBRARY_PATH
/usr/local/cuda-11.0/lib64

echo $CUDA_HOME
/usr/local/cuda-11.0

nvcc is properly recognised as well:

which nvcc
/usr/local/cuda-11.0/bin/nvcc
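
A related check, in case it is useful: whether the system linker cache knows about the CUDA libraries at all (the deb packages normally register /usr/local/cuda-11.0/lib64 with ldconfig, so this is an assumption about that registration having happened):

ldconfig -p | grep -E 'libcublas|libcusparse'
# each matching line should point into /usr/local/cuda-11.0/lib64;
# an empty result would mean the loader cache does not know about CUDA at all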

Now, I'm trying to create a conda environment, build SCS from source with GPU support inside that environment, and then install CVXPY on top. I followed these exact steps:

" conda create --name test_scs conda activate test_scs conda install -c conda-forge numpy scipy conda update --all python setup.py install --scs --gpu --int "

Note that after updating the conda environment, BLAS and LAPACK are provided by MKL. The output of the SCS installation is given below:

" Namespace(blas64=False, extraverbose=False, float32=False, gpu=True, int32=True, scs=True) running install running bdist_egg running egg_info writing scs.egg-info/PKG-INFO writing dependency_links to scs.egg-info/dependency_links.txt writing requirements to scs.egg-info/requires.txt writing top-level names to scs.egg-info/top_level.txt blas_mkl_info: libraries = ['mkl_rt', 'pthread'] library_dirs = ['/home/pierre/miniconda3/envs/test_scs/lib'] define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)] include_dirs = ['/home/pierre/miniconda3/envs/test_scs/include'] blas_opt_info: libraries = ['mkl_rt', 'pthread'] library_dirs = ['/home/pierre/miniconda3/envs/test_scs/lib'] define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)] include_dirs = ['/home/pierre/miniconda3/envs/test_scs/include'] lapack_mkl_info: libraries = ['mkl_rt', 'pthread'] library_dirs = ['/home/pierre/miniconda3/envs/test_scs/lib'] define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)] include_dirs = ['/home/pierre/miniconda3/envs/test_scs/include'] lapack_opt_info: libraries = ['mkl_rt', 'pthread'] library_dirs = ['/home/pierre/miniconda3/envs/test_scs/lib'] define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)] include_dirs = ['/home/pierre/miniconda3/envs/test_scs/include'] {'libraries': ['mkl_rt', 'pthread'], 'library_dirs': ['/home/pierre/miniconda3/envs/test_scs/lib'], 'define_macros': [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)], 'include_dirs': ['/home/pierre/miniconda3/envs/test_scs/include']} {'libraries': ['mkl_rt', 'pthread'], 'library_dirs': ['/home/pierre/miniconda3/envs/test_scs/lib'], 'define_macros': [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)], 'include_dirs': ['/home/pierre/miniconda3/envs/test_scs/include']} Adding scs 2.1.2 to easy-install.pth file

Installed /home/pierre/miniconda3/envs/test_scs/lib/python3.8/site-packages/scs-2.1.2-py3.8-linux-x86_64.egg Processing dependencies for scs==2.1.2 Searching for scipy==1.5.2 Best match: scipy 1.5.2 Adding scipy 1.5.2 to easy-install.pth file

Using /home/pierre/miniconda3/envs/test_scs/lib/python3.8/site-packages Searching for numpy==1.19.1 Best match: numpy 1.19.1 Adding numpy 1.19.1 to easy-install.pth file Installing f2py script to /home/pierre/miniconda3/envs/test_scs/bin Installing f2py3 script to /home/pierre/miniconda3/envs/test_scs/bin Installing f2py3.8 script to /home/pierre/miniconda3/envs/test_scs/bin

Using /home/pierre/miniconda3/envs/test_scs/lib/python3.8/site-packages Finished processing dependencies for scs==2.1.2 "
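
The CPU extensions import cleanly (a quick check; _scs_direct and _scs_indirect are, as far as I can tell, the CPU counterparts of _scs_gpu that this package builds):

python -c "import _scs_direct, _scs_indirect"
# exits silently if both CPU extension modules load fine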

When I try importing _scs_gpu with "import _scs_gpu", it fails with an "undefined symbol error":

" Traceback (most recent call last): File "compat_test.py", line 1, in import _scs_gpu ImportError: /home/pierre/miniconda3/envs/test_scs/lib/python3.8/site-packages/scs-2.1.2-py3.8-linux-x86_64.egg/_scs_gpu.cpython-38-x86_64-linux-gnu.so: undefined symbol: cusparseDcsrmv "

I've tried building on top of CUDA 10.2 as well and obtained the exact same error. SCS itself seems to be properly installed: I could install CVXPY on top with "pip install cvxpy", and SCS works fine with "solve(solver=SCS, gpu=False)". With "solve(solver=SCS, use_indirect=True, gpu=True)", it fails when trying to import _scs_gpu (exact same error as above).

I must admit that I'm not a pro when it comes to building from source, and I may have misunderstood something rather obvious, in which case I'm really sorry. If you have any idea about what could be causing the problem I'm facing, that would help a lot!

Thank you in advance,
Pierre

bodono commented 4 years ago

Hi Pierre, thanks for such a detailed bug report. First, let's try to run it outside of Python. Since you can build SCS with GPU support from source, you can run the GPU test by executing the following inside the scs directory:

make purge
make gpu
out/demo_socp_gpu_indirect 50

If that can't find the right libraries, then you can run

locate libcublas

Then pick one of the directories (the one that looks like it contains a working, correct CUDA library install) and run

export LD_LIBRARY_PATH=/path/to/that/directory
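
or, to keep whatever is already on the path, prepend instead of overwriting (the same shell idiom as in your ~/.bashrc):

export LD_LIBRARY_PATH=/path/to/that/directory${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}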

Then try running it again.

If this works then it's something to do with python and we can debug from there.
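
If it does turn out to be python-specific, the loader's own trace can show exactly which libcusparse gets opened at import time (LD_DEBUG is a standard glibc facility, nothing SCS-specific):

LD_DEBUG=libs python -c "import _scs_gpu" 2>&1 | grep -i cusparse
# prints the search paths tried and the libcusparse file that is finally opened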

bodono commented 4 years ago

You should also make sure you have downloaded and installed cusparse (https://developer.nvidia.com/cusparse). If you have, you can also check that the symbol actually exists in the cusparse library. Navigate to the directory containing libcusparse (same dir as libcublas etc.), then run

nm -gD libcusparse.so | grep csrmv

and you should see it. Here's what I see on my Linux box:

000000000017ed30 T cusparseCcsrmv
00000000001cbab0 T cusparseCcsrmv_hyb
0000000000083f70 T cusparseCcsrmv_mp
0000000000049640 T cusparseCcsrmv_v2
000000000018bbc0 T cusparseDcsrmv
00000000001cc9a0 T cusparseDcsrmv_hyb
0000000000083f90 T cusparseDcsrmv_mp
0000000000049650 T cusparseDcsrmv_v2
000000000017ed20 T cusparseScsrmv
00000000001cbb80 T cusparseScsrmv_hyb
0000000000083f80 T cusparseScsrmv_mp
0000000000049660 T cusparseScsrmv_v2
000000000018bbd0 T cusparseZcsrmv
00000000001cc8d0 T cusparseZcsrmv_hyb
0000000000083f60 T cusparseZcsrmv_mp
0000000000049630 T cusparseZcsrmv_v2
000000000006c4b0 T sparseCcsrmv
000000000006c490 T sparseDcsrmv
000000000006c4a0 T sparseScsrmv
000000000006c480 T sparseZcsrmv

In particular, cusparseDcsrmv is present.
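
The complementary check from the python side is to list which cusparse symbols the built extension itself expects (the .so name below is the one from your traceback):

nm -gD _scs_gpu.cpython-38-x86_64-linux-gnu.so | grep 'U cusparse'
# every symbol listed as U (undefined) here has to be exported by the
# libcusparse that the loader resolves at import time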

h-vetinari commented 3 years ago

I'm trying to add GPU builds for scs in conda-forge: https://github.com/conda-forge/scs-feedstock/pull/21

Help / feedback welcome. :)

shuaihuachen commented 2 years ago

Hi! I ran nm -gD libcusparse.so | grep csrmv in the directory containing libcusparse but got nothing. What should I do to fix this?

> You should also make sure you have downloaded and installed cusparse (https://developer.nvidia.com/cusparse). If you have, then you can also check to make sure that symbol actually exists in the cusparse library. [...] In particular, cusparseDcsrmv is present.

bodono commented 2 years ago

In later versions of CUDA the API has changed, and the latest version of SCS requires the new CUDA. You can check whether your libcusparse is up to date with:

nm -gD libcusparse.so.11 | grep SpMV
0000000000046830 T cusparseSpMV@@libcusparse.so.11
00000000000462f0 T cusparseSpMV_bufferSize@@libcusparse.so.11

In particular, SCS requires both cusparseSpMV and cusparseSpMV_bufferSize.
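
For context (as far as I recall the CUDA release history): the old cusparse<T>csrmv entry points were deprecated in the 10.x releases and removed in CUDA 11, with cusparseSpMV / cusparseSpMV_bufferSize from the generic API as their replacements, so an empty grep for csrmv on a recent toolkit is expected. The quickest way to see which side you are on:

nvcc --version
# a "release 11.x" (or later) toolkit exports the SpMV symbols but no longer
# the csrmv ones; a CUDA 10.x toolkit still exports csrmv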