NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

Is 16GB memory not enough? #14

Closed atx-barnes closed 2 years ago

atx-barnes commented 2 years ago

Tried testing the fox scene, but as soon as the window opened it would close with the error shown below:

C:\Users\usr\src\repos\instant-ngp> ./build/testbed --scene data/nerf/fox
14:24:29 INFO     Loading NeRF dataset from
14:24:29 INFO       data\nerf\fox\transforms.json
14:24:29 SUCCESS  Loaded 67 images of size 1080x1920 after 0s
14:24:29 INFO       cam_aabb=[min=[0.983595,-1.33309,-0.378748], max=[2.46175,1.00721,1.43941]]
14:24:29 INFO     Loading network config from: configs\nerf\base.json
14:24:29 INFO     GridEncoding:  Nmin=16 b=1.51572 F=2 T=2^19 L=16
14:24:29 INFO     Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
14:24:29 INFO     Color model:   3--[SphericalHarmonics]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
14:24:29 INFO       total_encoding_params=13074912 total_network_params=10240
14:24:30 ERROR    Uncaught exception: Could not allocate memory: CUDA Error: cudaMalloc(&rawptr, n_bytes+DEBUG_GUARD_SIZE*2) failed with error out of memory

The documentation doesn't specify the minimum amount of VRAM, so I assume that's what's behind the "Could not allocate memory" error? If not, what else could it be?

Tom94 commented 2 years ago

16GB should be plenty, actually. The fox dataset requires 7.56 GB on my machine. (see related discussion at #6)

Could it be that another program (or a particularly large screen setup) is reserving a lot of VRAM in the background?
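One quick way to check is to query the driver directly. Here is a minimal sketch using NVML, assuming the nvidia-ml-py package (import name pynvml) is installed; this is a generic debugging aid, not part of instant-ngp:

```python
# Minimal sketch: report total/used/free VRAM on the first GPU via NVML.
# Assumes the nvidia-ml-py package (import name: pynvml) is installed.
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # GPU 0
info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # byte counts
print(f"total: {info.total / 2**30:.2f} GiB")
print(f"used:  {info.used / 2**30:.2f} GiB")
print(f"free:  {info.free / 2**30:.2f} GiB")
pynvml.nvmlShutdown()
```

If `used` is already a gigabyte or more before the testbed even starts, some other process (browser, compositor, multi-monitor desktop) is eating into the budget.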

atx-barnes commented 2 years ago

Actually, I'm using a 2080, which only has 8 GB, and you mentioned the fox demo uses around 7.5 GB, so that could be what's causing my issue. Is there another demo that uses less memory?

pwais commented 2 years ago

The NeRF Lego scene uses ~6.8 GB according to nvidia-smi:

python3 scripts/run.py --scene=data/nerf_synthetic/lego/ --mode=nerf --screenshot_transforms=data/nerf_synthetic/lego/transforms_test.json --screenshot_w=800 --screenshot_h=800 --screenshot_dir=data/nerf_synthetic/lego/screenshots --save_snapshot=data/nerf_synthetic/lego/snapshot.msgpack --n_steps=1000

nerf_synthetic is from the official source linked in the README: https://drive.google.com/drive/folders/1JDdLGDruGNXWnM1eqY1FNL9PlStjaKWi

mmalex commented 2 years ago

Hello! @atx-barnes, we trimmed down the fox dataset simply by deleting a few of the frames that I judged to be the blurriest or most covered by other frames. The dataset still trains fine, and the memory reported is 7.3 GB; however, this does not include OS overhead and the framebuffer. I am not sure whether this is enough to bring the dataset under the 8 GB limit of your card, but it's worth trying again. Unfortunately I don't have an 8 GB card to hand to test with.

If it doesn't work, and you have time, feel free to locally delete a few image files from the data/nerf/fox/images folder until it runs. It doesn't particularly matter which ones. If you can report the number of image files below which it runs, I can try to prune the dataset down to that number to help others.

Many thanks
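For anyone scripting the experiment above, here is a minimal sketch that keeps every Nth frame and rewrites transforms.json to match. It assumes the fox dataset's layout (a top-level "frames" array whose entries carry a relative "file_path" with extension); keep_every is a hypothetical knob to tune until training fits in VRAM, and this uniform pruning is cruder than the blur-based selection described above:

```python
# Minimal sketch: keep every Nth frame of a NeRF dataset and rewrite
# transforms.json accordingly. Back up the dataset first; keep_every
# is a hypothetical knob (higher = fewer frames = less VRAM for rays).
import json
from pathlib import Path

dataset = Path("data/nerf/fox")
keep_every = 2  # keep half the frames

with open(dataset / "transforms.json") as f:
    transforms = json.load(f)

kept, dropped = [], []
for i, frame in enumerate(transforms["frames"]):
    (kept if i % keep_every == 0 else dropped).append(frame)

for frame in dropped:
    # file_path is relative to the dataset folder, e.g. "images/0001.jpg"
    (dataset / frame["file_path"]).unlink(missing_ok=True)

transforms["frames"] = kept
with open(dataset / "transforms.json", "w") as f:
    json.dump(transforms, f, indent=2)

print(f"kept {len(kept)} frames, deleted {len(dropped)} image files")
```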

pwais commented 2 years ago

It seems all the rays are always loaded into GPU memory, no? Instead of deleting images, could the optimizer have an option to keep the rays in CPU memory and only batch them to the GPU as needed? That would also buy scalability to much bigger scenes; e.g. you'd probably need this to do Tanks and Temples on an 11 GB card.
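instant-ngp's training loop is C++/CUDA, but the scheme proposed here can be sketched in a few lines of PyTorch: hold the full ray buffer in pinned (page-locked) host memory, gather each training batch into a pinned staging buffer, and stream only that batch to the GPU asynchronously. This is an illustration of the general technique under assumed shapes, not instant-ngp's actual data path:

```python
# Sketch of CPU-resident rays with batched async upload (PyTorch).
# Shapes are assumptions: 6 floats per ray (origin + direction).
import torch

n_rays, batch_size = 10_000_000, 262_144
rays_cpu = torch.empty(n_rays, 6, pin_memory=True)     # all rays on host
staging = torch.empty(batch_size, 6, pin_memory=True)  # reused each step

for step in range(1000):
    idx = torch.randint(0, n_rays, (batch_size,))
    torch.index_select(rays_cpu, 0, idx, out=staging)  # gather on CPU
    batch = staging.to("cuda", non_blocking=True)      # async H2D copy
    # ... run the training step on `batch` ...
    torch.cuda.synchronize()  # ensure the copy (and step) finished
                              # before overwriting the staging buffer
```

The GPU then only ever holds one batch of rays instead of the whole dataset, at the cost of a host-to-device copy per step, which pinned memory lets overlap with compute.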

Ben-Mack commented 2 years ago

@Tom94 @mmalex This is probably a bug, because I'm still hitting Uncaught exception: Could not allocate memory: CUDA Error: cudaMalloc(&rawptr, n_bytes+DEBUG_GUARD_SIZE*2) failed with error out of memory even though I've removed all but 4 images of nerf/fox (with transforms.json edited accordingly) on a GTX 1080 8 GB under Windows 10, and the GPU still has 7.4 GB of VRAM free.

Full output:

build\testbed --scene data/nerf/fox
12:50:52 INFO     Loading NeRF dataset from
12:50:52 INFO       data\nerf\fox\transforms.json
12:50:52 SUCCESS  Loaded 4 images of size 1080x1920 after 0s
12:50:52 INFO       cam_aabb=[min=[1.47019,-1.33309,0.171354], max=[1.54556,-1.30823,0.185121]]
12:50:52 INFO     Loading network config from: configs\nerf\base.json
12:50:52 INFO     GridEncoding:  Nmin=16 b=1.51572 F=2 T=2^19 L=16
Warning: FullyFusedMLP is not supported for the selected architecture 61. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
Warning: FullyFusedMLP is not supported for the selected architecture 61. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
12:50:52 INFO     Density model: 3--[HashGrid]-->32--[FullyFusedMLP(neurons=64,layers=3)]-->1
12:50:52 INFO     Color model:   3--[SphericalHarmonics]-->16+16--[FullyFusedMLP(neurons=64,layers=4)]-->3
12:50:52 INFO       total_encoding_params=13074912 total_network_params=9728
12:50:53 ERROR    Uncaught exception: Could not allocate memory: CUDA Error: cudaMalloc(&rawptr, n_bytes+DEBUG_GUARD_SIZE*2) failed with error out of memory

The only data that works is the models in data/sdf. Everything else from the nerf_synthetic Google Drive fails with the same error as above.

mhfy6868 commented 2 years ago

@mmalex @Tom94 Thanks for helping solve the problem. I have exactly the same issue as Ben-Mack described, with an RTX 2070S 8 GB: sdf works well, nerf does not. Deleting a few of the frames doesn't help, and it always reports:

total_encoding_params=13074912 total_network_params=10240

And throws the error:

Uncaught exception: Could not allocate memory: CUDA Error: cudaMalloc(&rawptr, n_bytes+DEBUG_GUARD_SIZE*2) failed with error out of memory

What shall we do to solve this? Many thanks

kpiorno commented 2 years ago


Changing the aabb_scale in the JSON file to 2 worked for me on an RTX 3070 mobile with 8 GB VRAM.
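For context on why this helps: aabb_scale sets the side length of the axis-aligned box that rays are marched in, and (as I understand it) instant-ngp keeps one occupancy-grid cascade per power of two of it, so lowering it from the fox dataset's 16 down to 2 also lowers GPU memory use, at the cost of clipping away geometry outside the smaller box. A minimal sketch of the edit, assuming the standard transforms.json layout:

```python
# Minimal sketch: lower aabb_scale in a dataset's transforms.json.
# aabb_scale should be a power of two; smaller boxes use less VRAM
# but clip away scene content outside them.
import json

path = "data/nerf/fox/transforms.json"
with open(path) as f:
    transforms = json.load(f)

transforms["aabb_scale"] = 2  # e.g. down from 16
with open(path, "w") as f:
    json.dump(transforms, f, indent=2)
```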