nv-tlabs / NKSR

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction
https://research.nvidia.com/labs/toronto-ai/NKSR

An error occurred while using the server #70

Closed Mr-ZhuJun closed 5 months ago

Mr-ZhuJun commented 6 months ago

Installation instructions:

  1. git clone https://github.com/nv-tlabs/NKSR.git
  2. cd NKSR
  3. conda env create
  4. conda activate nksr
  5. pip install nksr -f https://nksr.huangjh.tech/whl/torch-2.0.0+cu118.html

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

nvidia-smi
Fri Mar 1 18:29:48 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:3D:00.0 Off |                  N/A |
| 31%   31C    P8     3W / 250W |      0MiB / 11264MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Run the example:

python examples/recons_simple.py

Traceback (most recent call last):
  File "/data/code/NKSR/examples/recons_simple.py", line 11, in <module>
    import nksr
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/nksr/__init__.py", line 18, in <module>
    from nksr.nn.unet import SparseStructureNet
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/nksr/nn/__init__.py", line 10, in <module>
    from .modules import Conv3d, GroupNorm, Activation, GroupNorm, MaxPooling, Upsampling, SparseZeroPadding
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/nksr/nn/modules.py", line 14, in <module>
    from nksr.svh import SparseFeatureHierarchy, KernelMap, VoxelStatus
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/nksr/svh.py", line 12, in <module>
    import torch_scatter
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/torch_scatter/__init__.py", line 16, in <module>
    torch.ops.load_library(spec.origin)
  File "/data/path/conda/envs/nksr/lib/python3.10/site-packages/torch/_ops.py", line 643, in load_library
    ctypes.CDLL(path)
  File "/data/path/conda/envs/nksr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libc10_cuda.so: cannot open shared object file: No such file or directory
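
The OSError says the dynamic loader could not find libc10_cuda.so when torch_scatter was loaded. On Linux, CUDA-enabled PyTorch builds normally ship that file in the lib/ folder of the installed torch package, so one way to narrow the problem down is to check whether the file is there at all. The following is only a rough sketch: the helper name find_torch_lib is made up for illustration, and it assumes the usual site-packages/torch/lib layout.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_torch_lib(lib_name: str = "libc10_cuda.so",
                   package: str = "torch") -> Optional[Path]:
    """Return the path of lib_name inside <package>/lib, or None if absent.

    Hypothetical helper: assumes the shared libraries live in a lib/ folder
    next to the package's __init__.py.
    """
    # find_spec locates a top-level package without importing it, so this
    # works even when importing the package itself would fail.
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None  # package not installed (or a namespace package)
    candidate = Path(spec.origin).parent / "lib" / lib_name
    return candidate if candidate.is_file() else None


# Prints a path if the library is present, None otherwise.
print(find_torch_lib())
```

If this prints None inside the nksr environment, the installed torch is probably a CPU-only build or is missing entirely, which would be consistent with the error above.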

Eberty commented 5 months ago

The problem is in this step:

pip install nksr -f https://nksr.huangjh.tech/whl/torch-2.0.0+cu118.html

Please run

python -c "import torch; print(torch.__version__)"

in your environment.

In my case, the output was 2.2.1+cu121.

So I set TORCH_VERSION=2.2.1 and CUDA_VERSION=cu121 in the terminal

and ran

pip install -U nksr -f https://nksr.huangjh.tech/whl/torch-${TORCH_VERSION}+${CUDA_VERSION}.html

With the correct versions, this worked for me.
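
The same idea can be written as a small Python helper that turns the installed torch version string into the matching index URL. This is only a sketch: the function name nksr_index_url is made up, it assumes the URL pattern shown in this thread, and it does not check that a wheel index actually exists for a given torch/CUDA combination.

```python
import re

# Assumption: index pages follow the pattern used in this thread,
# .../whl/torch-<torch version>+<cuda tag>.html
INDEX_TEMPLATE = "https://nksr.huangjh.tech/whl/torch-{torch}+{cuda}.html"


def nksr_index_url(torch_version: str) -> str:
    """Build the index URL for a version string such as '2.2.1+cu121'."""
    match = re.fullmatch(r"(\d+\.\d+\.\d+)\+(cu\d+)", torch_version.strip())
    if match is None:
        # Strings without a CUDA tag (for example '2.0.0' or '2.0.0+cpu')
        # are rejected instead of guessing a tag.
        raise ValueError(f"no CUDA tag in torch version: {torch_version!r}")
    torch_part, cuda_part = match.groups()
    return INDEX_TEMPLATE.format(torch=torch_part, cuda=cuda_part)


# The version reported in the comment above:
print(nksr_index_url("2.2.1+cu121"))
# -> https://nksr.huangjh.tech/whl/torch-2.2.1+cu121.html
```

In a real environment, pass torch.__version__ instead of the literal string and hand the resulting URL to pip install -U nksr -f.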

andrewcaunes commented 5 months ago

Neither the standard instructions nor this answer worked for me, but the following did:

conda create -n nksr
conda activate nksr
conda install pytorch==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
conda install pytorch-scatter -c pyg
pip install nksr -f https://nksr.huangjh.tech/whl/torch-2.0.0+cu118.html
pip install requests==2.21 rich==13.4.2
pip install "oauthlib>=3.0.0" colorama "click>=7.0.0" requests==2.28.2
pip install "python-pycg[full]==0.5.2" -f https://pycg.huangjh.tech/packages/index.html
conda install pyntcloud -c conda-forge -c pyg -c pytorch

Somehow, installing from the full environment.yml always led to an OSError. This was on an Ubuntu 22.04 laptop with an RTX 3070 and CUDA 12.2.

Mr-ZhuJun commented 5 months ago

Thank you very much, the last answer is the correct solution!!

Kin-Zhang commented 4 months ago

For me, on Ubuntu 20.04, the following worked:

mamba create -n nksr python=3.10
mamba activate nksr
mamba install pytorch==2.0.0 pytorch-cuda=11.8 cudatoolkit=11.8 -c pytorch -c nvidia
pip install --verbose torch-scatter==2.1.1
pip install --verbose torch-sparse==0.6.8
pip install nksr -f https://nksr.huangjh.tech/whl/torch-2.0.0+cu118.html
pip install requests==2.21 rich==13.4.2
pip install "oauthlib>=3.0.0" colorama "click>=7.0.0" requests==2.28.2
pip install "python-pycg[full]==0.5.2" -f https://pycg.huangjh.tech/packages/index.html
mamba install pyntcloud -c pyg -c pytorch
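
After an install like the one above, it can help to confirm that the pinned versions are really what ended up in the environment. The snippet below is a minimal sketch, not part of NKSR: find_mismatches and EXPECTED are names made up here, the pins are copied from the commands above, and it assumes the packages expose standard distribution metadata under these names (conda-installed packages may not).

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the recipe above; adjust them to your own setup.
EXPECTED = {"torch": "2.0.0", "torch-scatter": "2.1.1"}


def find_mismatches(expected: Dict[str, str],
                    get_version: Callable[[str], str] = metadata.version
                    ) -> Dict[str, str]:
    """Map each package that differs from its pin to what was found instead."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Ignore a local suffix such as '+cu118' when comparing.
        if found.split("+", 1)[0] != wanted:
            problems[name] = found
    return problems


# Prints the mismatches, or a short message when every pin is satisfied.
print(find_mismatches(EXPECTED) or "all pins satisfied")
```

The get_version parameter exists only so the check can be exercised with a fake lookup; in normal use the default importlib.metadata.version is enough.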