yenchenlin / nerf-supervision-public

MIT License
206 stars 18 forks

Out of GPU Memory Error #9

Closed dedoogong closed 2 years ago

dedoogong commented 2 years ago

Hello! Thank you for your amazing work! I tried to follow your "fork" example, but I hit the out-of-memory error shown below. I'm using 8 V100s (32 GB per GPU). How much GPU memory do I need?

Loaded image data (3024, 4032, 3, 38) [3024. 4032. 384.3852275]
Loaded ./data/fork 2.8670193392574848 13.09573857906798
bds: [2.8920324 9.21226 ]
Data:
(38, 3, 5) (38, 3024, 4032, 3) (38, 2)
HOLDOUT view is 7
Loaded llff (38, 3024, 4032, 3) (120, 3, 5) [3024. 4032. 384.38522] ./data/fork
Auto LLFF holdout, 8
DEFINING BOUNDS
NEAR FAR 1.1999999284744263 6.090291976928711
Found ckpts []
Not ndc!
RENDER ONLY
test poses shape torch.Size([33, 3, 4])
0%| | 0/33 [00:00<?, ?it/s]0 0.0033724308013916016
/opt/conda/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1646755903507/work/aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
torch.Size([1512, 2016, 3]) torch.Size([1512, 2016])
max: 6.07417
3%|███▋ | 1/33 [01:49<58:17, 109.31s/it]1 109.31095910072327
max: 6.0749073
6%|███████▎ | 2/33 [03:38<56:18, 108.99s/it]2 108.7608253955841
6%|███████▏ | 2/33 [05:11<1:20:24, 155.63s/it]
Traceback (most recent call last):
  File "DS_NeRF/run_nerf.py", line 1058, in <module>
    train()
  File "DS_NeRF/run_nerf.py", line 810, in train
    rgbs, disps = render_path(render_poses, hwf, args.chunk, render_kwargs_test, gt_imgs=images, savedir=testsavedir, render_factor=args.render_factor)
  File "DS_NeRF/run_nerf.py", line 173, in render_path
    rgb, disp, acc, depth, extras = render(H, W, focal, chunk=chunk, c2w=c2w[:3,:4], retraw=True, **render_kwargs)
  File "DS_NeRF/run_nerf.py", line 136, in render
    all_ret = batchify_rays(rays, chunk, **kwargs)
  File "DS_NeRF/run_nerf.py", line 73, in batchify_rays
    all_ret = {k : torch.cat(all_ret[k], 0) for k in all_ret}
  File "DS_NeRF/run_nerf.py", line 73, in <dictcomp>
    all_ret = {k : torch.cat(all_ret[k], 0) for k in all_ret}
RuntimeError: CUDA out of memory. Tried to allocate 5.81 GiB (GPU 0; 31.75 GiB total capacity; 20.90 GiB already allocated; 2.74 GiB free; 27.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Thank you.
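
A rough back-of-the-envelope check on the failed 5.81 GiB allocation may help in sizing the memory. The sketch below estimates how large a single concatenated float32 output gets at the render resolution printed in the log (1512 x 2016). The helper name `tensor_gib` is hypothetical, and the per-ray layout (128 samples x 4 raw channels) is an assumption that the log does not confirm; it is chosen only because the traceback shows `retraw=True`, which suggests per-sample raw outputs may be kept for every ray.

```python
def tensor_gib(height, width, floats_per_ray, bytes_per_float=4):
    """Size in GiB of one tensor that stores `floats_per_ray` values per pixel."""
    n_rays = height * width  # one ray per rendered pixel
    return n_rays * floats_per_ray * bytes_per_float / 2**30


# Render resolution taken from the log line: torch.Size([1512, 2016, 3])
H, W = 1512, 2016

# ASSUMPTION (not confirmed by the log): 128 samples per ray,
# each with 4 raw network outputs (RGB + density), stored as float32.
raw_floats_per_ray = 128 * 4

print(f"per-sample raw output: {tensor_gib(H, W, raw_floats_per_ray):.2f} GiB")
print(f"final RGB image:       {tensor_gib(H, W, 3):.3f} GiB")
```

Under that assumption the per-sample tensor comes to about 5.81 GiB, the same size as the allocation that failed inside `torch.cat`, while the final RGB image is only a few tens of MiB. If the assumption holds, the peak is driven by full-resolution per-sample outputs being concatenated on a single GPU (the error names GPU 0 only), so rendering at a lower resolution via the `render_factor` argument visible in the traceback would likely help more than adding GPUs.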