pangsu0613 / CLOCs

CLOCs: Camera-LiDAR Object Candidates Fusion for 3D Object Detection
MIT License
352 stars 68 forks

RuntimeError: CUDA out of memory. #17

Closed yanliang-wang closed 3 years ago

yanliang-wang commented 3 years ago

Thank you for the excellent work, it looks very cool. I was trying to run it, but I hit an error about GPU memory. The log with the error is:

2d detection path: /home/wang/wang/git_files/test/CLOCs/d2_detection_data/data
sparse_shape: [  41 1600 1408]
num_class is : 1
load existing model
Restoring parameters from /home/wang/wang/git_files/test/CLOCs/model_dir/adam_optimizer-2.tckpt
{'Car': 5}
[-1]
load 14357 Car database infos
load 2207 Pedestrian database infos
load 734 Cyclist database infos
load 1297 Van database infos
load 56 Person_sitting database infos
load 488 Truck database infos
load 224 Tram database infos
load 337 Misc database infos
After filter database:
load 10520 Car database infos
load 2104 Pedestrian database infos
load 594 Cyclist database infos
load 826 Van database infos
load 53 Person_sitting database infos
load 321 Truck database infos
load 199 Tram database infos
load 259 Misc database infos
remain number of infos: 3712
remain number of infos: 3769
WORKER 0 seed: 1615082231
WORKER 1 seed: 1615082232
WORKER 2 seed: 1615082233
Traceback (most recent call last):
  File "./pytorch/train.py", line 926, in <module>
    fire.Fire()
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "./pytorch/train.py", line 379, in train
    raise e
  File "./pytorch/train.py", line 248, in train
    all_3d_output_camera_dict, all_3d_output, top_predictions, fusion_input,tensor_index = net(example_torch,detection_2d_path)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wang/wang/git_files/test/CLOCs/second/pytorch/models/voxelnet.py", line 310, in forward
    preds_dict = self.rpn(spatial_features)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wang/wang/git_files/test/CLOCs/second/pytorch/models/rpn.py", line 314, in forward
    ups.append(self.deblocks[i](x))
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wang/wang/git_files/test/CLOCs/torchplus/nn/modules/common.py", line 89, in forward
    input = module(input)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 106, in forward
    exponential_average_factor, self.eps)
  File "/home/wang/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/functional.py", line 1923, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 36.00 MiB (GPU 0; 3.94 GiB total capacity; 2.45 GiB already allocated; 80.38 MiB free; 2.53 GiB reserved in total by PyTorch)

My GPU is an NVIDIA GTX 1050 Ti with 4 GB of memory. Is it because my GPU memory is not enough? Can I solve the problem somehow without upgrading my GPU? And what is the minimum GPU memory requirement for this project?

pangsu0613 commented 3 years ago

Hello @yanliang-wang, thank you for your interest in CLOCs! Yes, you are right, it is because your GPU memory is too small. Although CLOCs fusion itself does not need much memory, in this version of the implementation I also need to run the 3D detector (SECOND-V1.5) on the fly, and that is the part that takes most of the memory. Currently, running CLOCs needs around 4400 MB of GPU memory for training and around 2500 MB for inference.
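
For anyone comparing their own card against these numbers, a quick check of total and currently used GPU memory can be done with PyTorch's CUDA counters. This is only a minimal sketch (device index 0 assumed), not part of the CLOCs code:

```python
import torch

# Report total, allocated, and reserved memory on GPU 0 so it can be
# compared against the ~4400 MB (training) / ~2500 MB (inference) figures.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_mb = props.total_memory / 1024 ** 2
    allocated_mb = torch.cuda.memory_allocated(0) / 1024 ** 2
    reserved_mb = torch.cuda.memory_reserved(0) / 1024 ** 2
    print(f"{props.name}: {total_mb:.0f} MB total, "
          f"{allocated_mb:.0f} MB allocated, {reserved_mb:.0f} MB reserved")
```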

yanliang-wang commented 3 years ago

Hello, thank you for the helpful reply. I'll try it with a new GPU.

MakerFace commented 1 year ago

Have you measured how much memory is needed for the CLOCs fusion alone? How can I measure it?
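
One rough way to get that number is to reset PyTorch's peak-memory counter, run a single forward pass of just the fusion part, and read the counter back. Below is a minimal sketch; `peak_cuda_mb` is a hypothetical helper, and the commented call reuses `fusion_input` / `tensor_index` from the traceback above, with `net.fusion` standing in for whatever the actual fusion sub-module attribute is in voxelnet.py:

```python
import torch

def peak_cuda_mb(fn, *args, device=0, **kwargs):
    """Run fn(*args, **kwargs) once and return the peak GPU memory it allocated, in MB."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats(device)
    with torch.no_grad():
        fn(*args, **kwargs)
    torch.cuda.synchronize(device)
    return torch.cuda.max_memory_allocated(device) / 1024 ** 2

# Hypothetical usage: wrap only the fusion forward call, e.g.
# print(peak_cuda_mb(net.fusion, fusion_input, tensor_index))
```

Note that this measures an inference-style pass only (no gradients); training would allocate more because activations, gradients, and optimizer state are also kept on the GPU.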