I failed to train the ScanNet scene241 on a GeForce RTX 3080 10GB.
I ran the command bash dev_scripts/w_scannet_etf/scene241.sh without the checkpoints/scannet/scene241/* provided, and training ran out of GPU memory:
Traceback (most recent call last):
File "train_ft.py", line 1091, in <module>
main()
File "train_ft.py", line 947, in main
model.optimize_parameters(total_steps=total_steps)
File "/home/touch/PycharmProjects/myfork/pointnerf/run/../models/neural_points_volumetric_model.py", line 217, in optimize_parameters
self.backward(total_steps)
File "/home/touch/PycharmProjects/myfork/pointnerf/run/../models/mvs_points_volumetric_model.py", line 104, in backward
self.loss_total.backward()
File "/home/touch/miniconda3/envs/pointnerf/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
torch.autograd.backward(
File "/home/touch/miniconda3/envs/pointnerf/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 588.00 MiB (GPU 0; 9.77 GiB total capacity; 5.89 GiB already allocated; 247.06 MiB free; 7.63 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
end loading
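Following the allocator hint at the end of the error message, one thing I can try is setting PYTORCH_CUDA_ALLOC_CONF before launching the script to reduce fragmentation. A minimal sketch, assuming the same script invocation (the 128 MiB split size is only an example value, not a verified fix):

# reduce allocator fragmentation before launching training
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
bash dev_scripts/w_scannet_etf/scene241.sh

This only mitigates fragmentation; given that 5.89 GiB was already allocated and 7.63 GiB reserved on a 9.77 GiB card, the allocation itself may still exceed the 10 GB available.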