Hello, first of all, thank you and your team for sharing. When I used docker to verify the algorithm, I found that the image contains no module for RING++. Could you please provide an image or a build method that includes RING++? Thank you again for sharing.

```
Traceback (most recent call last):
  File "main.py", line 10, in
```
Hi @diamonazreal, I am sorry about the missing RING++ module in the Docker Hub image. I'll update the image ASAP.
Thanks
Meanwhile, I get the following error when using the docker image you provided:
```
[ERROR] [1697637462.094589, 1648732982.415934]: bad callback: <function callback3 at 0x7fd3517eb040>
Traceback (most recent call last):
  File "/opt/ros/noetic/lib/python3/dist-packages/rospy/topics.py", line 750, in _invoke_callback
    cb(msg)
  File "main.py", line 361, in callback3
    pc_bev, pc_RING, pc_TIRING, _ = generate_RING(pc_normalized)
  File "/home/LoopDetection/src/RING_ros/util.py", line 296, in generate_RING
    pc_RING_normalized = fn.normalize(pc_RING, mean=pc_RING.mean(), std=pc_RING.std())
RuntimeError: CUDA error: no kernel image is available for execution on the device

[ERROR] [1697637462.360664, 1648732982.677983]: bad callback: <function callback1 at 0x7fd3517e5ee0>
Traceback (most recent call last):
  File "/opt/ros/noetic/lib/python3/dist-packages/rospy/topics.py", line 750, in _invoke_callback
    cb(msg)
  File "main.py", line 262, in callback1
    pc_bev, pc_RING, pc_TIRING, _ = generate_RING(pc_normalized)
  File "/home/LoopDetection/src/RING_ros/util.py", line 296, in generate_RING
    pc_RING_normalized = fn.normalize(pc_RING, mean=pc_RING.mean(), std=pc_RING.std())
RuntimeError: CUDA error: no kernel image is available for execution on the device
```
Can you provide some suggestions to solve it? Thank you again!
@diamonazreal, this issue is usually caused by a mismatch between the GPU driver and the CUDA version. Could you provide detailed information about your GPU model and driver version? Also, you might need to use the run.bash script in docker to start a container that links the host GPU to the container.
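(A minimal sketch for collecting that information from inside the container, assuming PyTorch is importable there; the driver version itself comes from `nvidia-smi` on the host.)

```python
# Quick environment report from inside the container (sketch, not part of the repo).
import torch

print("torch:", torch.__version__)            # e.g. 1.11.0+cu113
print("built for CUDA:", torch.version.cuda)  # CUDA version the wheel targets
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    major, minor = torch.cuda.get_device_capability(0)
    print(f"compute capability: sm_{major}{minor}")
```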
My GPU is a GTX 3060, and I hit the above problem with both driver versions 515 and 470. The docker container is built using run.sh, and the image is MaverickPeter/MR_SLAM.
@diamonazreal, I've updated the docker image on Docker Hub with the latest version of this repo. Would you kindly check whether the CUDA problem persists with the new version?
Hello @MaverickPeter, I used your newly provided image for the Quick Demo test, but the above problem still exists.
Based on this answer, I suggest force-reinstalling PyTorch by running `pip3 install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113`.
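As a quick sanity check after the reinstall (a sketch, not from the repo): a tiny elementwise op on a CUDA tensor should fail with the same "no kernel image" error if the installed wheel still ships no kernels for the GPU.

```python
# CUDA smoke test mirroring the normalize step that fails in util.py.
import torch

x = torch.randn(64, 64, device="cuda")
y = (x - x.mean()) / x.std()        # plain elementwise CUDA kernels
torch.cuda.synchronize()            # force kernel execution now
print("ok, sum =", y.sum().item())  # a RuntimeError here points at the torch build, not RING++
```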
@MaverickPeter Hello, I ran the command inside docker as you suggested, but the error still exists. At the same time, I tested other CUDA test code found on the Internet and did not see this problem.
@diamonazreal It seems the problem is caused by torch. Have you tried upgrading torch to 1.12.1? You can use the command below: `pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113`
I tried using a 2060 GPU and ran it successfully. Can't this algorithm run on a 30-series GPU?
It might be a compatibility problem between CUDA and PyTorch. I'll try to reproduce this issue on a 30-series GPU and find a solution.
@diamonazreal, I've reproduced this issue on a 30-series GPU. The problem is caused by the open-sourced radon transform implementation, which might have a compatibility problem. I'll dig into it and try to fix it.
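If the radon transform op is built as a PyTorch CUDA extension (an assumption on my part), one possible fix is rebuilding it with an architecture list that covers 30-series GPUs (sm_86). A hypothetical sketch; the package and source file names below are placeholders, not the repo's actual files:

```python
# Hypothetical setup.py for rebuilding a custom CUDA op with Ampere support.
import os
from setuptools import setup
from torch.utils.cpp_extension import CUDAExtension, BuildExtension

# torch.utils.cpp_extension reads this when choosing nvcc -gencode flags.
os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "7.5;8.0;8.6")

setup(
    name="radon_cuda",  # placeholder name
    ext_modules=[
        CUDAExtension(
            name="radon_cuda",
            sources=["radon_cuda.cpp", "radon_cuda_kernel.cu"],  # placeholder sources
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```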
@MaverickPeter Thank you for trying
@diamonazreal I downgraded the environment in the docker image to CUDA 11.1.1 with PyTorch 1.10.1 and everything went well. I've already updated the docker image, and you can now download it via docker pull.
@MaverickPeter Hi, I'm experiencing the same issue with the latest docker image. My GPU is a 4060 and the driver version is 535.129.03. Do you have any suggestions to fix this issue?
When I run it locally, torch 1.12.1+cu113 works fine, but when I use the latest docker image I encounter the same error, and after changing the docker image's torch to 1.12.1+cu113 the error is still there.
@Joosoo1 You can try upgrading CUDA to 12.x and using the corresponding torch version. The compute capability of the 4060 is 8.9 (sm_89), while CUDA 11.x only supports architectures up to sm_86. But I have no idea why the code works fine locally.
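A quick way to see that mismatch from Python (a sketch; `torch.cuda.get_arch_list()` reports the architectures the installed wheel was compiled for):

```python
# Compare the GPU's architecture with the architectures baked into the torch wheel.
import torch

major, minor = torch.cuda.get_device_capability(0)
device_arch = f"sm_{major}{minor}"         # e.g. sm_89 on a 4060
wheel_archs = torch.cuda.get_arch_list()   # e.g. [..., 'sm_86'] for cu113 wheels

print("device:", device_arch)
print("wheel :", wheel_archs)
if device_arch not in wheel_archs:
    print("-> the installed torch build ships no kernels for this GPU")
```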
Good news! When I docker pull the image directly, RING++ works fine, but when I build the image from the Dockerfile and run RING++, I encounter the above error.
Cool. There are several minor changes when I build the docker image from the Dockerfile; I'll check it later.