facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

RuntimeError: Not compiled with GPU support when running knn in a process launched on slurm #1730

Closed IMasI2Cat closed 7 months ago

IMasI2Cat commented 7 months ago

I first had a problem installing pytorch3d with conda: following the steps in the README to install all requirements led to incompatibility issues that could not be resolved. So I manually downloaded the package from Anaconda that matched my Python, PyTorch and CUDA versions (in my case, wget https://anaconda.org/pytorch3d/pytorch3d/0.7.4/download/linux-64/pytorch3d-0.7.4-py310_cu116_pyt1130.tar.bz2) and installed it with conda install pytorch3d-0.7.4-py310_cu116_pyt1130.tar.bz2. This seemed to work, and I could import pytorch3d and its submodules correctly. However, when I run my code (via Slurm, which is the only way to run it on a GPU on the server I am working on), I get the following error when calling pytorch3d.loss.chamfer_distance:

Traceback (most recent call last):
  File "/home/imas/code/Repos/KAIR_32/models/loss.py", line 237, in forward
    distx, disty = chamfer_distance(pcx, pcy)
  File "/home/imas/miniconda3/envs/pytorch3d/lib/python3.10/site-packages/pytorch3d/loss/chamfer.py", line 231, in chamfer_distance
    cham_x, cham_norm_x = _chamfer_distance_single_direction(
  File "/home/imas/miniconda3/envs/pytorch3d/lib/python3.10/site-packages/pytorch3d/loss/chamfer.py", line 113, in _chamfer_distance_single_direction
    x_nn = knn_points(x, y, lengths1=x_lengths, lengths2=y_lengths, norm=norm, K=1)
  File "/home/imas/miniconda3/envs/pytorch3d/lib/python3.10/site-packages/pytorch3d/ops/knn.py", line 187, in knn_points
    p1_dists, p1_idx = _knn_points.apply(
  File "/home/imas/miniconda3/envs/pytorch3d/lib/python3.10/site-packages/pytorch3d/ops/knn.py", line 72, in forward
    idx, dists = _C.knn_points_idx(p1, p2, lengths1, lengths2, norm, K, version)
RuntimeError: Not compiled with GPU support.

bottler commented 7 months ago

Please share the output of conda list and pip list. I think you might have more than one pytorch3d in your environment.
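As a complement to conda list and pip list, a short script can show which pytorch3d the interpreter would actually import and whether PyTorch sees a GPU. The following is only a sketch: describe_install and cuda_status are hypothetical helper names, not part of PyTorch3D, and the result is only meaningful if the script runs in the same environment as the failing job (for example, submitted through Slurm).

```python
import importlib.util
from importlib import metadata


def describe_install(package: str) -> dict:
    """Report where `package` would be imported from and which versions are registered."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    location = spec.origin if spec is not None else None
    # More than one matching distribution hints at a duplicate install.
    versions = sorted(
        dist.version
        for dist in metadata.distributions()
        if (dist.metadata["Name"] or "").lower() == package.lower()
    )
    return {"package": package, "location": location, "versions": versions}


def cuda_status() -> dict:
    """Report the PyTorch version, its CUDA build, and whether a GPU is visible."""
    try:
        import torch
    except ImportError:
        return {"torch": None, "cuda_build": None, "cuda_available": False}
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds of PyTorch
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_install("pytorch3d"))
    print(cuda_status())
```

If versions lists more than one entry, or location points somewhere other than the expected environment, that would be consistent with the duplicate-install theory; it does not by itself prove how the package was built.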

renyu2016 commented 7 months ago

Hi, I have run into exactly the same problem. My pip list:

Package Version Editable project location
absl-py 2.0.0
addict 2.4.0
aiosignal 1.3.1
ansi2html 1.9.1
asttokens 2.4.1
attrs 23.1.0
backcall 0.2.0
blinker 1.7.0
cachetools 5.3.2
certifi 2023.11.17
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
comm 0.2.0
ConfigArgParse 1.7
contourpy 1.1.1
cycler 0.12.1
dash 2.14.2
dash-core-components 2.0.0
dash-html-components 2.0.0
dash-table 5.0.0
decorator 5.1.1
dominate 2.6.0
einops 0.7.0
executing 2.0.1
fastjsonschema 2.19.1
filelock 3.13.1
Flask 3.0.0
fonttools 4.47.0
frozenlist 1.4.1
fsspec 2023.12.2
fvcore 0.1.5.post20221221
google-auth 2.25.2
google-auth-oauthlib 1.0.0
grpcio 1.60.0
gym 0.26.2
gym-notices 0.0.8
h5py 3.10.0
idna 3.6
imageio 2.33.1
importlib-metadata 7.0.1
importlib-resources 6.1.1
iopath 0.1.10
ipython 8.12.3
ipywidgets 8.1.1
isaacgym 1.0rc4 /data/ubuntu_data/isaacgym/python
itsdangerous 2.1.2
jedi 0.19.1
Jinja2 3.1.2
joblib 1.3.2
jsonpatch 1.33
jsonpointer 2.4
jsonschema 4.20.0
jsonschema-specifications 2023.12.1
jupyter_core 5.5.1
jupyterlab-widgets 3.0.9
kiwisolver 1.4.5
Markdown 3.5.1
MarkupSafe 2.1.3
matplotlib 3.7.4
matplotlib-inline 0.1.6
MouseInfo 0.1.3
mpmath 1.3.0
msgpack 1.0.7
nbformat 5.7.0
nest-asyncio 1.5.8
networkx 3.1
ninja 1.11.1.1
numpy 1.23.0
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.2
open3d 0.17.0
opencv-python 4.8.1.78
packaging 23.2
pandas 2.0.3
parso 0.8.3
pexpect 4.9.0
pickleshare 0.7.5
Pillow 10.1.0
pip 23.3.1
pkgutil_resolve_name 1.3.10
platformdirs 4.1.0
plotly 5.18.0
portalocker 2.8.2
prompt-toolkit 3.0.43
protobuf 4.25.1
psutil 5.9.7
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.1
pyasn1-modules 0.3.0
PyAutoGUI 0.9.54
PyGetWindow 0.0.9
Pygments 2.17.2
PyMsgBox 1.0.9
pyparsing 3.1.1
pyperclip 1.8.2
pyquaternion 0.9.9
PyRect 0.2.0
PyScreeze 0.1.30
python-dateutil 2.8.2
python3-xlib 0.15
pytorch3d 0.7.6
pytweening 1.0.7
pytz 2023.3.post1
PyYAML 6.0.1
ray 2.9.0
referencing 0.32.0
requests 2.31.0
requests-oauthlib 1.3.1
retrying 1.3.4
rl-games 1.5.2
rpds-py 0.16.2
rsa 4.9
scikit-learn 1.3.2
scipy 1.10.1
setproctitle 1.3.3
setuptools 58.0.4
six 1.16.0
stack-data 0.6.3
sympy 1.12
tabulate 0.9.0
tenacity 8.2.3
tensorboard 2.14.0
tensorboard-data-server 0.7.2
tensorboardX 2.6.2.2
termcolor 2.4.0
threadpoolctl 3.2.0
torch 1.9.0+cu111
torchaudio 0.9.0
torchvision 0.10.0+cu111
tornado 6.4
tqdm 4.66.1
traitlets 5.14.0
triton 2.1.0
typing_extensions 4.9.0
tzdata 2023.3
urllib3 2.1.0
visdom 0.2.4
wcwidth 0.2.12
websocket-client 1.7.0
Werkzeug 3.0.1
wheel 0.41.2
widgetsnbextension 4.0.9
yacs 0.1.8
zipp 3.17.0

My conda list:

Name Version Build Channel
_libgcc_mutex 0.1 main https://repo.anaconda.com/pkgs/main
_openmp_mutex 5.1 1_gnu https://repo.anaconda.com/pkgs/main
absl-py 2.0.0 pypi_0 pypi
addict 2.4.0 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
ansi2html 1.9.1 pypi_0 pypi
asttokens 2.4.1 pypi_0 pypi
attrs 23.1.0 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
blinker 1.7.0 pypi_0 pypi
ca-certificates 2023.12.12 h06a4308_0 https://repo.anaconda.com/pkgs/main
cachetools 5.3.2 pypi_0 pypi
certifi 2023.11.17 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
click 8.1.7 pypi_0 pypi
cloudpickle 3.0.0 pypi_0 pypi
comm 0.2.0 pypi_0 pypi
configargparse 1.7 pypi_0 pypi
contourpy 1.1.1 pypi_0 pypi
cycler 0.12.1 pypi_0 pypi
dash 2.14.2 pypi_0 pypi
dash-core-components 2.0.0 pypi_0 pypi
dash-html-components 2.0.0 pypi_0 pypi
dash-table 5.0.0 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
dominate 2.6.0 pypi_0 pypi
einops 0.7.0 pypi_0 pypi
executing 2.0.1 pypi_0 pypi
fastjsonschema 2.19.1 pypi_0 pypi
filelock 3.13.1 pypi_0 pypi
flask 3.0.0 pypi_0 pypi
fonttools 4.47.0 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2023.12.2 pypi_0 pypi
fvcore 0.1.5.post20221221 pypi_0 pypi
google-auth 2.25.2 pypi_0 pypi
google-auth-oauthlib 1.0.0 pypi_0 pypi
grpcio 1.60.0 pypi_0 pypi
gym 0.26.2 pypi_0 pypi
gym-notices 0.0.8 pypi_0 pypi
h5py 3.10.0 pypi_0 pypi
idna 3.6 pypi_0 pypi
imageio 2.33.1 pypi_0 pypi
importlib-metadata 7.0.1 pypi_0 pypi
importlib-resources 6.1.1 pypi_0 pypi
iopath 0.1.10 pypi_0 pypi
ipython 8.12.3 pypi_0 pypi
ipywidgets 8.1.1 pypi_0 pypi
isaacgym 1.0rc4 dev_0
itsdangerous 2.1.2 pypi_0 pypi
jedi 0.19.1 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
joblib 1.3.2 pypi_0 pypi
jsonpatch 1.33 pypi_0 pypi
jsonpointer 2.4 pypi_0 pypi
jsonschema 4.20.0 pypi_0 pypi
jsonschema-specifications 2023.12.1 pypi_0 pypi
jupyter-core 5.5.1 pypi_0 pypi
jupyterlab-widgets 3.0.9 pypi_0 pypi
kiwisolver 1.4.5 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1 https://repo.anaconda.com/pkgs/main
libffi 3.4.4 h6a678d5_0 https://repo.anaconda.com/pkgs/main
libgcc-ng 11.2.0 h1234567_1 https://repo.anaconda.com/pkgs/main
libgomp 11.2.0 h1234567_1 https://repo.anaconda.com/pkgs/main
libstdcxx-ng 11.2.0 h1234567_1 https://repo.anaconda.com/pkgs/main
markdown 3.5.1 pypi_0 pypi
markupsafe 2.1.3 pypi_0 pypi
matplotlib 3.7.4 pypi_0 pypi
matplotlib-inline 0.1.6 pypi_0 pypi
mouseinfo 0.1.3 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
msgpack 1.0.7 pypi_0 pypi
nbformat 5.7.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0 https://repo.anaconda.com/pkgs/main
nest-asyncio 1.5.8 pypi_0 pypi
networkx 3.1 pypi_0 pypi
ninja 1.11.1.1 pypi_0 pypi
numpy 1.23.0 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.18.1 pypi_0 pypi
nvidia-nvjitlink-cu12 12.3.101 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
oauthlib 3.2.2 pypi_0 pypi
open3d 0.17.0 pypi_0 pypi
opencv-python 4.8.1.78 pypi_0 pypi
openssl 3.0.12 h7f8727e_0 https://repo.anaconda.com/pkgs/main
packaging 23.2 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
pexpect 4.9.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 10.1.0 pypi_0 pypi
pip 23.3.1 py38h06a4308_0 https://repo.anaconda.com/pkgs/main
pkgutil-resolve-name 1.3.10 pypi_0 pypi
platformdirs 4.1.0 pypi_0 pypi
plotly 5.18.0 pypi_0 pypi
portalocker 2.8.2 pypi_0 pypi
prompt-toolkit 3.0.43 pypi_0 pypi
protobuf 4.25.1 pypi_0 pypi
psutil 5.9.7 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pyasn1 0.5.1 pypi_0 pypi
pyasn1-modules 0.3.0 pypi_0 pypi
pyautogui 0.9.54 pypi_0 pypi
pygetwindow 0.0.9 pypi_0 pypi
pygments 2.17.2 pypi_0 pypi
pymsgbox 1.0.9 pypi_0 pypi
pyparsing 3.1.1 pypi_0 pypi
pyperclip 1.8.2 pypi_0 pypi
pyquaternion 0.9.9 pypi_0 pypi
pyrect 0.2.0 pypi_0 pypi
pyscreeze 0.1.30 pypi_0 pypi
python 3.8.18 h955ad1f_0 https://repo.anaconda.com/pkgs/main
python-dateutil 2.8.2 pypi_0 pypi
python3-xlib 0.15 pypi_0 pypi
pytorch3d 0.7.6 pypi_0 pypi
pytweening 1.0.7 pypi_0 pypi
pytz 2023.3.post1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
ray 2.9.0 pypi_0 pypi
readline 8.2 h5eee18b_0 https://repo.anaconda.com/pkgs/main
referencing 0.32.0 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
retrying 1.3.4 pypi_0 pypi
rl-games 1.5.2 pypi_0 pypi
rpds-py 0.16.2 pypi_0 pypi
rsa 4.9 pypi_0 pypi
scikit-learn 1.3.2 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setproctitle 1.3.3 pypi_0 pypi
setuptools 58.0.4 pypi_0 pypi
six 1.16.0 pypi_0 pypi
sqlite 3.41.2 h5eee18b_0 https://repo.anaconda.com/pkgs/main
stack-data 0.6.3 pypi_0 pypi
sympy 1.12 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tenacity 8.2.3 pypi_0 pypi
tensorboard 2.14.0 pypi_0 pypi
tensorboard-data-server 0.7.2 pypi_0 pypi
tensorboardx 2.6.2.2 pypi_0 pypi
termcolor 2.4.0 pypi_0 pypi
threadpoolctl 3.2.0 pypi_0 pypi
tk 8.6.12 h1ccaba5_0 https://repo.anaconda.com/pkgs/main
torch 1.9.0+cu111 pypi_0 pypi
torchaudio 0.9.0 pypi_0 pypi
torchvision 0.10.0+cu111 pypi_0 pypi
tornado 6.4 pypi_0 pypi
tqdm 4.66.1 pypi_0 pypi
traitlets 5.14.0 pypi_0 pypi
triton 2.1.0 pypi_0 pypi
typing-extensions 4.9.0 pypi_0 pypi
tzdata 2023.3 pypi_0 pypi
urllib3 2.1.0 pypi_0 pypi
visdom 0.2.4 pypi_0 pypi
wcwidth 0.2.12 pypi_0 pypi
websocket-client 1.7.0 pypi_0 pypi
werkzeug 3.0.1 pypi_0 pypi
wheel 0.41.2 py38h06a4308_0 https://repo.anaconda.com/pkgs/main
widgetsnbextension 4.0.9 pypi_0 pypi
xz 5.4.5 h5eee18b_0 https://repo.anaconda.com/pkgs/main
yacs 0.1.8 pypi_0 pypi
zipp 3.17.0 pypi_0 pypi
zlib 1.2.13 h5eee18b_0 https://repo.anaconda.com/pkgs/main

IMasI2Cat commented 7 months ago

Sorry, I forgot to answer this, but I solved it by running all the steps with srun. Should I close the issue?
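For anyone landing here with the same error, the fix described above might look roughly like the sketch below. It assumes that "all the steps" means the installation commands from the first comment, and that the cluster hands out GPUs through srun --gres=gpu:1; the exact srun options are cluster-specific, so check with your administrators. The helper names with_srun and run_steps are hypothetical, and the script only prints the commands unless dry_run is set to False.

```python
import shlex
import subprocess


def with_srun(command, gpus=1):
    """Prefix `command` so Slurm runs it on a node with `gpus` GPUs allocated."""
    # --gres=gpu:N is an assumption; some clusters expect other options.
    return ["srun", f"--gres=gpu:{gpus}", *command]


def run_steps(steps, dry_run=True):
    """Wrap each step with srun; execute it unless dry_run is True."""
    rendered = []
    for step in steps:
        wrapped = with_srun(step)
        rendered.append(shlex.join(wrapped))
        if not dry_run:
            subprocess.run(wrapped, check=True)
    return rendered


# Steps mirroring the install described at the top of this issue (illustrative only).
INSTALL_STEPS = [
    ["conda", "install", "-y", "pytorch3d-0.7.4-py310_cu116_pyt1130.tar.bz2"],
    ["python", "-c", "import torch; print(torch.cuda.is_available())"],
]

if __name__ == "__main__":
    for line in run_steps(INSTALL_STEPS, dry_run=True):
        print(line)
```

The intent is that both installation and the later check run on a GPU node rather than on the login node, which is one plausible reading of why the install behaved differently under srun.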

Russ-Yan commented 5 months ago

Same issue. Could you tell me how you solved this? Thanks!