creiser / kilonerf

Code for KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs

About inference #1

Open thunanguyen opened 3 years ago

thunanguyen commented 3 years ago

Hi! I've run into a bug. Training works fine, but when I run inference it fails with: GPUassert: too many resources requested for launch network_eval.cu 292

My GPU is an RTX Titan, my CUDA version is 11.1, my cuDNN version is 8, and my OS is Ubuntu 18.04.

creiser commented 3 years ago

Hi! Yeah, we experienced the same problem on an RTX 2080 Ti, which is probably quite close to your RTX Titan. Despite these cards being newer than the GTX 1080 Ti we used, there seems to be a shortage of a certain resource. It might be that the compiler uses more registers for these newer cards (more than are available). I will look into it.

thunanguyen commented 3 years ago

Thanks for your response. I have been looking into this for weeks now and still don't know what's happening. Maybe it's because of the ray-tracing hardware of RTX GPUs?

creiser commented 3 years ago

An easy fix should be to run the network_eval kernel with fewer threads per block, e.g. 512 instead of 640, but then performance suffers. It should also be possible to run it on a newer GPU with the same block size. Maybe they reduced some resources for these ray-tracing capabilities, but I also checked, and the specification says that these cards still have the same register count and shared memory.
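For context, "too many resources requested for launch" means the kernel's per-thread register (or shared memory) usage times the block size exceeds what the SM provides, and the error surfaces only at launch time. A minimal sketch of the two knobs discussed here — shrinking the block size, or capping register usage with `__launch_bounds__` so the original block size stays legal. The kernel below is a hypothetical stand-in, not the actual network_eval kernel from this repo:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder for network_eval. __launch_bounds__(640) tells nvcc
// the kernel will never be launched with more than 640 threads per block, so
// the compiler limits registers per thread accordingly (spilling to local
// memory if needed) instead of producing a kernel that fails to launch.
__global__ void __launch_bounds__(640) network_eval_sketch(const float* in,
                                                           float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;  // placeholder computation
}

int main() {
    const int n = 1 << 16;
    float *in = nullptr, *out = nullptr;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    int threads = 640;  // the block size the issue is about; 512/256 are the fallbacks
    int blocks = (n + threads - 1) / threads;
    network_eval_sketch<<<blocks, threads>>>(in, out, n);

    // "too many resources requested for launch" is reported here, not at compile time.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        printf("launch failed: %s\n", cudaGetErrorString(err));

    cudaDeviceSynchronize();
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

The trade-off mentioned above follows from this: fewer threads per block lowers the resource demand per launch but also reduces occupancy, which is why 512 or 256 threads cost performance.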

creiser commented 3 years ago

I fixed the problem, and it now runs on an RTX 2080 Ti, so it should also work for you. Even with this suboptimal fix, I measured 17 ms on the Lego scene with the RTX 2080 Ti.

e79af85

bruinxiong commented 3 years ago

@creiser It's weird. My NVIDIA GTX 1080 Ti is the same as yours (compute capability 6.1). Even when I set fewer threads per block, e.g. 512 instead of 640, I still hit GPUassert: too many resources requested for launch network_eval.cu 292. I have to go down to 256 and take the performance hit.

creiser commented 3 years ago

@bruinxiong Yeah, that does not make sense with a GTX 1080 Ti. Did you use the pre-compiled CUDA extension, or did you compile the code yourself? If you compiled the extension yourself, this problem might be caused by an old version of the CUDA Toolkit. Just to be safe, make sure to use a recent driver version as well (but this shouldn't be the cause). If your PC has multiple GPUs, make sure the right one is being used; the program prints the GPU in use at startup.
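To double-check which GPU a CUDA process is actually using (and its resource limits, which are what this error is about), a small standalone query like the following can help. This is a generic sketch using the CUDA runtime API, not code from this repo:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // The "current" device is what kernels will launch on; with multiple GPUs
    // this may not be the one you expect (CUDA_VISIBLE_DEVICES reorders them).
    int dev = 0;
    cudaGetDevice(&dev);

    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, dev);

    // regsPerBlock and sharedMemPerBlock are exactly the limits that trigger
    // "too many resources requested for launch" when a kernel exceeds them.
    printf("Using GPU %d: %s (CC %d.%d)\n", dev, prop.name, prop.major, prop.minor);
    printf("  registers per block:      %d\n", prop.regsPerBlock);
    printf("  shared memory per block:  %zu bytes\n", prop.sharedMemPerBlock);
    printf("  max threads per block:    %d\n", prop.maxThreadsPerBlock);
    return 0;
}
```

If the printed card differs from the one you intended, restricting visibility with `CUDA_VISIBLE_DEVICES` before launching is the usual fix.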

bruinxiong commented 3 years ago

@creiser Hi, I compiled the code myself. I use the latest CUDA Toolkit, 11.2.0. I have 3 GTX 1080 Ti GPUs in my PC and use the default one (GPU 0) to render in the interactive viewer mode. I can only set 256 threads per block; once I increase it beyond 256, the above error appears.

creiser commented 3 years ago

Can you try to run it with the precompiled extension, please?
