rusty1s / pytorch_sparse

PyTorch Extension Library of Optimized Autograd Sparse Matrix Operations
MIT License

Segmentation fault when importing torch-sparse (installing pytorch-geometric) #285

Closed alexodavies closed 1 year ago

alexodavies commented 2 years ago

I am trying to install pytorch-geometric for a deep-learning project. Importing torch-sparse throws a segmentation fault (see below). Initially I tried different versions of each required library, as I thought it might be a GPU issue, but I've since simplified the setup by installing CPU-only versions.

Python 3.9.12 (main, Apr  5 2022, 06:56:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
>>> import torch
>>> import torch_scatter
>>> import torch_cluster
>>> import torch_sparse

Segmentation fault (core dumped)

The same crash, presumably caused by torch_sparse, occurs when importing torch_geometric:

Python 3.9.12 (main, Apr  5 2022, 06:56:58) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
>>> import torch_geometric

Segmentation fault (core dumped)

I'm on an Ubuntu distribution:

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:    22.04
Codename:   jammy

Here are my conda installs (lightweight for DL):

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
blas                      1.0                         mkl  
brotlipy                  0.7.0           py310h7f8727e_1002  
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2022.07.19           h06a4308_0  
certifi                   2022.9.24       py310h06a4308_0  
cffi                      1.15.1          py310h74dc2b5_0  
charset-normalizer        2.0.4              pyhd3eb1b0_0  
cpuonly                   2.0                           0    pytorch
cryptography              37.0.1          py310h9ce1e76_0  
fftw                      3.3.9                h27cfd23_1  
idna                      3.4             py310h06a4308_0  
intel-openmp              2021.4.0          h06a4308_3561  
jinja2                    3.0.3              pyhd3eb1b0_0  
joblib                    1.1.1           py310h06a4308_0  
ld_impl_linux-64          2.38                 h1181459_1  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 11.2.0               h1234567_1  
libgfortran-ng            11.2.0               h00389a5_1  
libgfortran5              11.2.0               h1234567_1  
libgomp                   11.2.0               h1234567_1  
libstdcxx-ng              11.2.0               h1234567_1  
libuuid                   1.0.3                h7f8727e_2  
markupsafe                2.1.1           py310h7f8727e_0  
mkl                       2021.4.0           h06a4308_640  
mkl-service               2.4.0           py310h7f8727e_0  
mkl_fft                   1.3.1           py310hd6ae3a3_0  
mkl_random                1.2.2           py310h00e6091_0  
ncurses                   6.3                  h5eee18b_3  
numpy                     1.23.3          py310hd5efca6_0  
numpy-base                1.23.3          py310h8e6c178_0  
openssl                   1.1.1q               h7f8727e_0  
pip                       22.2.2          py310h06a4308_0  
pycparser                 2.21               pyhd3eb1b0_0  
pyg                       2.1.0           py310_torch_1.12.0_cpu    pyg
pyopenssl                 22.0.0             pyhd3eb1b0_0  
pyparsing                 3.0.9           py310h06a4308_0  
pysocks                   1.7.1           py310h06a4308_0  
python                    3.10.6               haa1d7c7_0  
pytorch                   1.12.1             py3.10_cpu_0    pytorch
pytorch-cluster           1.6.0           py310_torch_1.12.0_cpu    pyg
pytorch-mutex             1.0                         cpu    pytorch
pytorch-scatter           2.0.9           py310_torch_1.12.0_cpu    pyg
pytorch-sparse            0.6.15          py310_torch_1.12.0_cpu    pyg
readline                  8.1.2                h7f8727e_1  
requests                  2.28.1          py310h06a4308_0  
scikit-learn              1.1.2           py310h6a678d5_0  
scipy                     1.9.1           py310hd5efca6_0  
setuptools                63.4.1          py310h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.39.3               h5082296_0  
threadpoolctl             2.2.0              pyh0d69192_0  
tk                        8.6.12               h1ccaba5_0  
tqdm                      4.64.1          py310h06a4308_0  
typing_extensions         4.3.0           py310h06a4308_0  
tzdata                    2022e                h04d1e81_0  
urllib3                   1.26.12         py310h06a4308_0  
wheel                     0.37.1             pyhd3eb1b0_0  
xz                        5.2.6                h5eee18b_0  
zlib                      1.2.13               h5eee18b_0  

Any help would be greatly appreciated!
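
One detail worth checking in a report like this: the interpreter banner above says Python 3.9.12, while the package list shows python 3.10.6 with py310 builds. That may mean the interpreter being launched does not belong to the listed environment, although the thread does not confirm this. The following is a minimal sketch for checking which interpreter is running and which package versions it sees. The helper name describe_environment is hypothetical, and the distribution names passed to it are assumptions about how the packages are registered. It reads installed metadata only, so it does not import the packages that crash.

```python
import sys
from importlib import metadata


def describe_environment(packages):
    """Return interpreter details and installed versions without importing the packages.

    Uses importlib.metadata, so a package whose import would crash can
    still be inspected.
    """
    info = {
        "executable": sys.executable,
        "python": ".".join(map(str, sys.version_info[:3])),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None  # not installed for this interpreter
    return info


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match the installation.
    names = ["torch", "torch-scatter", "torch-cluster", "torch-sparse", "torch-geometric"]
    for key, value in describe_environment(names).items():
        print(f"{key}: {value}")
```

If the printed executable path points outside the intended environment, the compiled extensions may have been built for a different Python or PyTorch than the one loading them.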

rusty1s commented 2 years ago

How did you install torch-sparse? Can you try to check which line in torch_sparse/__init__.py produces the segfault?
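
One possible way to find the offending line, assuming the crash happens during import: Python's standard faulthandler module prints the Python-level traceback when the process receives SIGSEGV. The sketch below runs the import in a child interpreter with faulthandler enabled, so the crash does not take down the calling session. The helper name import_with_traceback is hypothetical.

```python
import subprocess
import sys


def import_with_traceback(module_name: str) -> subprocess.CompletedProcess:
    """Import `module_name` in a child interpreter with faulthandler enabled.

    If the import segfaults, faulthandler writes the Python traceback
    (file names and line numbers) to the child's stderr before it dies.
    """
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = import_with_traceback("torch_sparse")
    # A negative return code means the child was killed by a signal
    # (for example -11 for SIGSEGV on Linux).
    print("return code:", result.returncode)
    print(result.stderr)
```

The traceback in stderr should end at the line of torch_sparse/__init__.py (or a module it calls) that was executing when the fault occurred.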

WPettersson commented 1 year ago

I'm seeing similar issues under Gentoo with torch installed via pip (pip install torch). I've traced it in torch_sparse/__init__.py to https://github.com/rusty1s/pytorch_sparse/blob/ed2af8e2a074eff603ba2d781fda940191d16e31/torch_sparse/__init__.py#L18 where spec is

ModuleSpec(name='_version_cpu', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x7f7a5fede560>, origin='/home/user/.virtualenvs/gns/lib/python3.10/site-packages/torch_sparse/_version_cpu.so')

Tracing further shows that the segmentation fault is actually triggered by the _dlopen call in ctypes/__init__.py when it opens _version_cpu.so. I'm not sure where to take the debugging from here.
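
To isolate the library load from the rest of the import machinery, one option is to open the shared object directly with ctypes.CDLL in a child process and inspect the exit code. This is only a sketch under the assumption that the fault can be reproduced outside torch_sparse/__init__.py. The helper name probe_dlopen and its preload parameter are hypothetical.

```python
import subprocess
import sys

# Code run in the child: optionally import a module first, then dlopen the library.
_CHILD_CODE = (
    "import sys, ctypes, importlib\n"
    "if sys.argv[2]:\n"
    "    importlib.import_module(sys.argv[2])\n"
    "ctypes.CDLL(sys.argv[1])\n"
)


def probe_dlopen(path: str, preload: str = "") -> int:
    """Try to open `path` via ctypes.CDLL in a child interpreter.

    `preload` is an optional module to import first (for example "torch",
    so the libraries the extension links against are already loaded).
    Returns the child's exit code: 0 on success, a positive value for an
    ordinary Python exception such as OSError, and a negative value if a
    signal killed the child (-11 is SIGSEGV on Linux).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", _CHILD_CODE, path, preload]
    )
    return proc.returncode
```

For example, probe_dlopen("/home/user/.virtualenvs/gns/lib/python3.10/site-packages/torch_sparse/_version_cpu.so", preload="torch") uses the origin path from the ModuleSpec above. Running it with and without the preload, and against other .so files in the same directory, may show whether the crash is specific to one library or to how it interacts with the installed torch build.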

The output of pip freeze is:

absl-py==1.3.0
autopep8==2.0.1
certifi==2022.12.7
charset-normalizer==2.1.1
contourpy==1.0.6
cycler==0.11.0
dm-tree==0.1.8
fonttools==4.38.0
idna==3.4
Jinja2==3.1.2
joblib==1.2.0
kiwisolver==1.4.4
MarkupSafe==2.1.1
matplotlib==3.6.2
numpy==1.24.1
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
packaging==22.0
Pillow==9.4.0
psutil==5.9.4
pycodestyle==2.10.0
pyevtk==1.5.0
pyparsing==3.0.9
python-dateutil==2.8.2
requests==2.28.1
scikit-learn==1.2.0
scipy==1.10.0
six==1.16.0
threadpoolctl==3.1.0
tomli==2.0.1
torch==1.13.1
torch-cluster==1.6.0
torch-geometric==2.2.0
torch-scatter==2.1.0
torch-sparse==0.6.16
tqdm==4.64.1
typing_extensions==4.4.0
urllib3==1.26.13

github-actions[bot] commented 1 year ago

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?