Closed Brin333 closed 1 year ago
Hey, I got the same error and here is how I fixed it. Use a Conda environment: remove your old environment and create a new one as follows, and you should be good:
Give it a name and use Python 3.8, no later, since newer versions are not supported by a few dependencies.
$ conda create -n <env_name> python=3.8
$ conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
$ pip install wandb
$ pip install --ignore-installed PyYAML
$ pip install open3d
$ pip install multimethod
$ pip install termcolor
$ pip install trimesh
$ pip install easydict
$ conda install -c "nvidia/label/cuda-11.7.0" cuda
$ cd external_libs/pointops && python setup.py install
Now you are good to go. :)
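After the steps above, it can help to confirm that the environment really contains the pinned versions. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are made up for illustration, and it assumes the packages register themselves under the distribution names `torch`, `torchvision` and `torchaudio`.

```python
from importlib import metadata  # standard library in Python 3.8+

# Versions taken from the conda install command in this comment.
EXPECTED = {
    "torch": "1.7.1",
    "torchvision": "0.8.2",
    "torchaudio": "0.7.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs.

    Local build tags such as "+cu110" are ignored when comparing.
    """
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated conda environment; any line it prints points at a package that differs from the versions listed above.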
Right now I am trying to figure out how to visualise the predicted results and save them as an .obj file. If someone can help, that would be great. Really appreciate the great work.
@ZauraizAlamgeer thank you very much, I have fixed it! Really appreciate it!
thank you very much, appreciate it.
The error:
File "/ToothGroupNetwork/models/modules/cbl_point_transformer/cbl_point_transformermodule.py", line 104, in forward
    p0 = pxo[:,:,:3].reshape(-1, 3).contiguous()
RuntimeError: CUDA error: no kernel image is available for execution on the device
Hi @coordxyz, I have an RTX 3070 and the instructions in my comment work like a charm. Use Python 3.8 in a conda environment.
If you still wish to keep the versions you mentioned, check whether the following command reports the correct CUDA version:
$ nvcc --version
If you see CUDA 11.3, then run the following:
$ export CUDA_HOME=/path/to/your/cuda/
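The `nvcc --version` check can also be scripted. This is a small sketch under the assumption that the output contains a line of the form `release X.Y`, as in the nvcc output posted later in this thread; the function names are hypothetical.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract the 'release X.Y' number from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nvcc_release():
    """Run `nvcc --version` if nvcc is on PATH; return (major, minor) or None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = nvcc_release()
    if release is None:
        print("nvcc not found on PATH (or its output could not be parsed)")
    else:
        print("nvcc release: %d.%d" % release)
```

If this reports that nvcc is missing, that corresponds to the "returns empty" case, where only the CUDA runtime is installed and not the full toolkit.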
If you still face issues, I recommend following my previous instructions, which will definitely work. Since you have an RTX 3080 Ti, you will need to add the following argument in external_libs/pointops/setup.py (this is Python code inside setup.py, not a shell command):
extra_compile_args={'cxx': ['-g'], 'nvcc': ['-O2', '-gencode', 'arch=compute_61,code=sm_61', '-gencode', 'arch=compute_75,code=sm_75', '-gencode', 'arch=compute_86,code=sm_86']}
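The `-gencode` entries follow a regular pattern, one pair per GPU compute capability (6.1, 7.5 and 8.6 in the line above; the RTX 3080 Ti is an 8.6 device). As an illustration only, here is a sketch that builds the same list from capability tuples. `gencode_flags` and `nvcc_args` are hypothetical helpers, not functions from the repository or from PyTorch.

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode flags for a list of (major, minor) compute capabilities."""
    flags = []
    for major, minor in capabilities:
        arch = f"{major}{minor}"
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags


def nvcc_args(capabilities):
    """Return the 'nvcc' list used in extra_compile_args, with -O2 first."""
    return ["-O2"] + gencode_flags(capabilities)


# Reproduces the argument quoted in this comment.
extra_compile_args = {
    "cxx": ["-g"],
    "nvcc": nvcc_args([(6, 1), (7, 5), (8, 6)]),
}

if __name__ == "__main__":
    print(extra_compile_args)
```

On a machine where PyTorch can see the GPU, `torch.cuda.get_device_capability()` returns the tuple to use. Note that the installed nvcc must know the target architecture: `sm_86` needs CUDA 11.1 or newer, and the RTX 4090 (capability 8.9) needs CUDA 11.8 or newer.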
As a last resort, if $ nvcc --version returns nothing, then run:
$ sudo apt-get install nvidia-cuda-toolkit
This is because PyTorch installs only the CUDA runtime; installing via apt-get installs the whole toolkit (including nvcc) and resolves the symbolic link issues.
Let me know if you need any more help.
Hi @ZauraizAlamgeer, I have an RTX 4090. I installed the environment using the method you provided, but it still reports the error:
Traceback (most recent call last):
File "preprocess_data.py", line 56, in
Hi @ZauraizAlamgeer, thanks a lot for your advice. I finally managed to create the environment by using pip rather than conda to install PyTorch.
GPU: RTX 3080Ti
Python: 3.7.7
cuda: 11.3
open3d: 0.9.0
torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1
BTW, I save the predicted result as an *.obj file by simply adding gu.save_mesh('output.obj', gu.get_colored_mesh(mesh, pred_labels)) at the end of eval_visualize_results.py.
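For readers who want to see what a colored .obj export amounts to without the repository's `gu` helpers, here is a standalone sketch. It is not the implementation of `gu.save_mesh` or `gu.get_colored_mesh`. It assumes one predicted label per vertex and uses the common (non-standard) OBJ extension of appending `r g b` to each `v` line. The palette and function names are made up for illustration.

```python
import os
import tempfile


def colors_from_labels(labels, palette):
    """Map each integer label to an (r, g, b) color, cycling through the palette."""
    return [palette[label % len(palette)] for label in labels]


def write_colored_obj(path, vertices, faces, colors):
    """Write a triangle mesh with per-vertex colors to a Wavefront OBJ file.

    vertices: list of (x, y, z); faces: list of 0-based (i, j, k) index triples;
    colors: list of (r, g, b) floats in [0, 1], one per vertex.
    """
    if len(vertices) != len(colors):
        raise ValueError("need exactly one color per vertex")
    with open(path, "w") as handle:
        for (x, y, z), (r, g, b) in zip(vertices, colors):
            handle.write(f"v {x} {y} {z} {r} {g} {b}\n")
        for i, j, k in faces:
            # OBJ face indices are 1-based.
            handle.write(f"f {i + 1} {j + 1} {k + 1}\n")


if __name__ == "__main__":
    # Tiny demo: one triangle, two labels (0 = background, 1 = a tooth).
    palette = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
    vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    faces = [(0, 1, 2)]
    colors = colors_from_labels([0, 1, 1], palette)
    out_path = os.path.join(tempfile.gettempdir(), "output_demo.obj")
    write_colored_obj(out_path, vertices, faces, colors)
    print("wrote", out_path)
```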
Are you on Ubuntu or Windows?
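Hello, I am on Linux with a 3080 Ti as well, and my environment is pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0. However, when I train the tsegnet centroid prediction model, the loss values are all NaN. Do you know what causes this?
Update: the server I am using has a 3090 GPU. I installed torch and CUDA with the following command:
$ pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
This resolved the CUDA error.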
Hi @ZauraizAlamgeer, I get the same "RuntimeError: CUDA error: no kernel image is available for execution on the device" despite following the same procedure and package versions.
GPU: RTX 3090
Python: 3.8
cuda: 11.0
open3d: 0.18.0
torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.0a0+a853dff
Would you please help me have a look? Thank you so much!
cd external_libs/pointops & python setup.py install
I can't really use pip instead of conda (like the other users who solved the problem), because I don't have root permission (no sudo access). So the cudatoolkit can only be installed through conda.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090        Off |   00000000:18:00.0 Off |                  N/A |
| 39%   34C    P8             19W /  350W |      10MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off |   00000000:51:00.0 Off |                  N/A |
| 39%   32C    P8             17W /  350W |      10MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 3090        Off |   00000000:8A:00.0 Off |                  N/A |
| 38%   33C    P8             17W /  350W |      10MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 3090        Off |   00000000:C3:00.0 Off |                  N/A |
| 42%   33C    P8             23W /  350W |      10MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    1   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    2   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
|    3   N/A  N/A      2788      G   /usr/lib/xorg/Xorg                              4MiB |
+-----------------------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_May__3_18:49:52_PDT_2022
Cuda compilation tools, release 11.7, V11.7.64
Build cuda_11.7.r11.7/compiler.31294372_0
Under the 'ToothGroupNetwork-challenge_branch' folder, I run:
$ python inference_final.py --input_path ./tmp --save_path ./results
Error: Error(s) in loading state_dict for TfCblFirstModule: While copying the parameter named "first_ins_cent_model.enc1.0.linear.weight", whose dimensions in the model are torch.Size([32, 6]) and whose dimensions in the checkpoint are torch.Size([32, 6]), an exception occurred : ('CUDA error: no kernel image is available for execution on the device',).
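"No kernel image is available" generally means the binary contains no code compiled for the GPU's architecture. One way to narrow this down is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. This is a rough sketch with a simplified rule: a binary for `sm_XY` runs on a device of the same major version X and a minor version of at least Y, and PTX (`compute_XY`) entries are ignored. `parse_sm` and `has_kernel_for` are hypothetical names. The check covers only PyTorch's own kernels, not the separately compiled pointops extension.

```python
import re


def parse_sm(arch):
    """'sm_86' -> (8, 6); returns None for other entries such as 'compute_80'."""
    match = re.fullmatch(r"sm_(\d+)(\d)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_kernel_for(capability, arch_list):
    """True if some sm_XY entry has the device's major version and minor <= device's."""
    major, minor = capability
    for arch in arch_list:
        sm = parse_sm(arch)
        if sm is not None and sm[0] == major and sm[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None
    if torch is None or not torch.cuda.is_available():
        print("PyTorch with a visible CUDA device is required for the live check")
    else:
        capability = torch.cuda.get_device_capability(0)
        arch_list = torch.cuda.get_arch_list()
        print("device capability:", capability)
        print("PyTorch built for:", arch_list)
        print("matching kernel image:", has_kernel_for(capability, arch_list))
```

If the last line prints False, the PyTorch build itself has no kernels for the GPU, and rebuilding pointops alone is unlikely to help.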