Open kurtjcu opened 9 months ago
+1, similar issue also seen in splatfacto with docker images "1.0.0" and "1.0.1". Here is the error I got:
I am getting similar issues as well for both 1.0.0 and 1.0.1 images :(
Hi, can you please try the nerfstudio/nerfstudio:latest image?
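(For anyone following along, trying that image is just the standard Docker workflow; a minimal sketch, where the mounted path and the viewer port mapping are only examples:)

# Pull the current official image and run it with GPU access
docker pull nerfstudio/nerfstudio:latest
docker run --gpus all -it -p 7007:7007 -v /path/to/your/data:/workspace nerfstudio/nerfstudio:latest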
That seems to do the trick for me, TYSM! I've been using gaussian-splatting with the dromni/nerfstudio:main image for a while now. Is there any difference or improvement with the splatfacto method in the nerfstudio/nerfstudio:latest image?
Edit: Just wanted to say I am running on an RTX 3060 12GB.
Good to hear it works. I am still testing the image, so please let me know if you find any issues. Unfortunately, I don't have access to the Dockerfile used to build dromni/nerfstudio:main, but the nerfstudio/nerfstudio:latest image simply contains the latest commit on the main branch of nerfstudio.
I was having the same issue on all the latest images as well. I built an image from the main branch myself with Docker, and then it worked.
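For reference, a build from the main branch along these lines should reproduce that; this is only a sketch, assuming the Dockerfile sits at the repository root, and the tag name is arbitrary:

# Clone nerfstudio and build the image from the current main branch
git clone https://github.com/nerfstudio-project/nerfstudio.git
cd nerfstudio
docker build --tag nerfstudio-main --file Dockerfile .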
I was having the same issue on an NVIDIA T4, but the nerfstudio/nerfstudio:latest image seems to work like a charm - thanks @jkulhanek.
May I ask what you have changed? Looking online it seems to be something related to the supported CUDA architectures, but the official nerfstudio Docker image seems to contain the most common archs, including sm_75, which corresponds to the T4. So I'm just curious what specifically is addressing the issue here?
I solved it by modifying CUDA_ARCHITECTURES in the Dockerfile and then building the image myself. (My GPU is an NVIDIA RTX 3090.)
ARG CUDA_ARCHITECTURES=90;89;86;80;75;70
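Since CUDA_ARCHITECTURES is declared with ARG, it can also be overridden at build time without editing the Dockerfile; a sketch (the tag name is arbitrary):

# Pass the trimmed architecture list as a build argument instead of editing the file
docker build \
  --build-arg CUDA_ARCHITECTURES="90;89;86;80;75;70" \
  --tag nerfstudio-custom-arch \
  --file Dockerfile .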
Sorry, but I don't have access to the Dockerfiles dromni used to build the dromni/nerfstudio images, so I don't know which CUDA archs they were built with.
I solved it by modifying CUDA_ARCHITECTURES in the Dockerfile and then building the image myself. (My GPU is an NVIDIA RTX 3090.)
ARG CUDA_ARCHITECTURES=90;89;86;80;75;70;61;52;37 (original)
ARG CUDA_ARCHITECTURES=90;89;86;80;75;70 (modified)

I believe the 3090 has CUDA compute capability 7.5, so the default docker image should work just fine. Are you having issues?
The Compute Capability of the GeForce RTX 3090 is 8.6 (https://developer.nvidia.com/cuda-gpus). Although it's hard to pinpoint the exact reason, when older architectures such as 61;52;37 were included in CUDA_ARCHITECTURES, the same issues reported by the other users occurred. Like everyone else, I discovered this solution through a lot of trial and error. :)
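For anyone who wants to sanity-check their own setup, comparing the GPU's compute capability against the architectures the installed PyTorch build supports is quick (note this only covers torch itself, not separately compiled CUDA extensions such as gsplat or tiny-cuda-nn):

# Inside the container: print the GPU's compute capability and the arch list torch was built for
python -c "import torch; print(torch.cuda.get_device_capability(0)); print(torch.cuda.get_arch_list())"

An RTX 3090 should report (8, 6), so the arch list needs to include sm_86; if it is missing, the "no kernel image is available" error above is expected.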
Describe the bug
Current docker images with tags "main", "1.0.1", and "1.0.0" crash when training with:
RuntimeError: CUDA error: no kernel image is available for execution on the device

To Reproduce
Steps to reproduce the behavior: run the above command on a machine with a 3090 (Ubuntu server, NVIDIA driver with support up to CUDA 12.3).

Expected behavior
Using the above command with container tag "0.3.4" functions correctly.