GugaGugaGuga opened this issue 2 years ago
It should work with cudatoolkit v10.1. Try this inside your Python environment:
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
I don't have Anaconda; I only have Python 3.8 under Ubuntu. All the previous steps ran through and only this command is a problem. Are there any other options?
wjy@wjy:~/Documents/ge-spmm-master/pytorch-custom$ python3.8
Python 3.8.12 (default, Sep 10 2021, 00:16:05)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
1.4.0
>>> torch.cuda.is_available()
True
The installation test succeeds, so why does this error happen when running the gcn_custom_2layer.py file? Please help.
Hi @GugaGugaGuga, the error occurs because cusparse in CUDA 11 and CUDA 10 have different APIs, so what matters is the CUDA Toolkit version. Can you try print(torch.version.cuda) in Python and see if the output is 10.x?
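For reference, a quick check from the interactive prompt looks like this (if you installed with pip instead of conda, the default torch==1.4.0 wheel should also be a CUDA 10.1 build):
>>> import torch
>>> torch.__version__        # expect '1.4.0'
>>> torch.version.cuda       # expect '10.1' for this repo
>>> torch.cuda.is_available()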
Yes, the output is 10.1. I know CUDA 11 and CUDA 10 have different APIs, but this setup is on CUDA 10.1. Is there any problem with CUDA 10.1?
Sorry, torch.version.cuda does not matter here. The shared library spmm.so is compiled with your system's default CUDA, so if your default nvcc is >= 11 there would be a problem. First check whether your system CUDA is the correct version with nvcc --version.
To further rule out problems, can you share the output when you execute the script? In particular, we need the lines printed when the shared library is JIT-compiled, for example:
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/guyue/anaconda3/lib/python3.8/site-packages/torch/include -isystem /home/guyue/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/guyue/anaconda3/lib/python3.8/site-packages/torch/include/TH -isystem /home/guyue/anaconda3/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/guyue/anaconda3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/guyue/ge-spmm/pytorch-custom/spmm_kernel.cu -o spmm_kernel.cuda.o
In my case the compilation goes through /usr/local/cuda-10.1/bin/nvcc, which works fine.
Note that you may need to clean the compilation cache and run again to see this logging; in your case, delete the folder /tmp/torch_extensions/spmm if it's still there.
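Concretely, something like this should force a clean rebuild and show which nvcc gets picked up (adjust the script name to the one you are running):
nvcc --version                                  # should report release 10.x
which nvcc                                      # should point into /usr/local/cuda-10.1/bin
rm -rf /tmp/torch_extensions/spmm               # clear the cached extension build
python3.8 gcn_custom_2layer.py --n-hidden=32    # rebuilds spmm.so and prints the compile commands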
wjy@wjy:~/Documents/ge-spmm-master/pytorch-custom$ python3.8 gcn_custom.py --n-hidden=32
Using /tmp/torch_extensions as PyTorch extensions root...
Creating extension directory /tmp/torch_extensions/spmm...
Detected CUDA files, patching ldflags
Emitting ninja build file /tmp/torch_extensions/spmm/build.ninja...
Building extension module spmm...
[1/3] c++ -MMD -MF spmm.o.d -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/TH -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/wjy/Documents/ge-spmm-master/pytorch-custom/spmm.cpp -o spmm.o
[2/3] /usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/TH -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++11 -c /home/wjy/Documents/ge-spmm-master/pytorch-custom/spmm_kernel.cu -o spmm_kernel.cuda.o
[3/3] c++ spmm.o spmm_kernel.cuda.o -shared -L/usr/local/cuda-10.1/lib64 -lcudart -o spmm.so
Loading extension module spmm...
Traceback (most recent call last):
File "gcn_custom.py", line 9, in <module>
from op import GCNConv
File "/home/wjy/Documents/ge-spmm-master/pytorch-custom/op.py", line 6, in <module>
spmm = load(name='spmm', sources=['spmm.cpp', 'spmm_kernel.cu'], verbose=True)
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 670, in load
return _jit_compile(
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 877, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1088, in _import_module_from_library
return imp.load_module(module_name, file, path, description)
File "/usr/lib/python3.8/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.8/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: /tmp/torch_extensions/spmm/spmm.so: undefined symbol: cusparseCsr2cscEx2
wjy@wjy:~/Downloads$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
Following your answer, I deleted the folder /tmp/torch_extensions/spmm, but it is still the same.
I cannot reproduce the error... Does your LD_LIBRARY_PATH include /usr/local/cuda-10.1/lib64?
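You could also check that the loader actually resolves the CUDA 10.1 cusparse and that the missing symbol is present there, for example (assuming the default /usr/local/cuda-10.1 install path):
echo $LD_LIBRARY_PATH                       # should contain /usr/local/cuda-10.1/lib64
ldconfig -p | grep libcusparse              # which cusparse libraries the loader knows about
nm -D /usr/local/cuda-10.1/lib64/libcusparse.so | grep cusparseCsr2cscEx2   # the symbol should show up here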
export PATH="/usr/local/cuda-10.1/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
export CUDA_HOME="/usr/local/cuda-10.1"
Yes
Since I cannot reproduce the environment problem, I suggest you use a Docker image that I have tested and that works fine. This is the easiest way. The image pytorch/pytorch:1.4-cuda10.1-cudnn7-devel will work for this repo.
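For example, assuming Docker and the NVIDIA container toolkit are installed (so --gpus works), you could mount the repo into the container like this, adjusting the host path to where you checked it out:
docker pull pytorch/pytorch:1.4-cuda10.1-cudnn7-devel
docker run --gpus all -it -v /home/wjy/Documents/ge-spmm-master:/workspace/ge-spmm-master pytorch/pytorch:1.4-cuda10.1-cudnn7-devel
# then, inside the container:
# cd /workspace/ge-spmm-master/pytorch-custom && python gcn_custom_2layer.py --n-hidden=32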
When I run "python3.8 gcn_custom_2layer.py --n-hidden=32", the following happens:
Using /tmp/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /tmp/torch_extensions/spmm/build.ninja...
Building extension module spmm...
[1/3] c++ -MMD -MF spmm.o.d -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/TH -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /home/wjy/Documents/ge-spmm-master/pytorch-custom/spmm.cpp -o spmm.o
[2/3] /usr/local/cuda-10.1/bin/nvcc -DTORCH_EXTENSION_NAME=spmm -DTORCH_API_INCLUDE_EXTENSION_H -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/TH -isystem /home/wjy/.local/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -std=c++11 -c /home/wjy/Documents/ge-spmm-master/pytorch-custom/spmm_kernel.cu -o spmm_kernel.cuda.o
[3/3] c++ spmm.o spmm_kernel.cuda.o -shared -L/usr/local/cuda-10.1/lib64 -lcudart -o spmm.so
Loading extension module spmm...
Traceback (most recent call last):
File "gcn_custom_2layer.py", line 9, in <module>
from op import GCNConv
File "/home/wjy/Documents/ge-spmm-master/pytorch-custom/op.py", line 6, in <module>
spmm = load(name='spmm', sources=['spmm.cpp', 'spmm_kernel.cu'], verbose=True)
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 670, in load
return _jit_compile(
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 877, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/home/wjy/.local/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1088, in _import_module_from_library
return imp.load_module(module_name, file, path, description)
File "/usr/lib/python3.8/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/usr/lib/python3.8/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: /tmp/torch_extensions/spmm/spmm.so: undefined symbol: cusparseCsr2cscEx2
Please help me figure out how to get past this.
Do I need to install cuDNN, and if so, what version?
The code does not depend on cuDNN, only cuSPARSE, which comes with the CUDA Toolkit (we need version <= 10.1). Again, I suggest using Docker to solve the environment problem; the pytorch/pytorch:1.4-cuda10.1-cudnn7-devel image should work.
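Once inside that container, a quick sanity check of the toolchain before rerunning the example might look like:
nvcc --version                              # should report release 10.1
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"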