open-mmlab / mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark
https://mmpretrain.readthedocs.io/en/latest/
Apache License 2.0
3.49k stars, 1.08k forks

[Bug] Running vis_cam.py on multiple images raises a CUDA out-of-memory error #1664

Open equinox-lyl opened 1 year ago

equinox-lyl commented 1 year ago

Branch

main branch (mmpretrain version)

Describe the bug

I modified vis_cam.py to add an args.imgroot option so that it accepts a directory of input images. When I run vis_cam.py on CUDA to generate and save CAM images for more than one image, only the first CAM image is saved; on the second image I get a CUDA out-of-memory error. How can I solve this? I tried adding torch.cuda.empty_cache(), but it did not help.

CUDA_VISIBLE_DEVICES=0 python tools/visualization/vis_cam.py "/home/user/mocov2_resnet50.py" "/home/user/mmpretrain/resnet50/epoch_200.pth" --imgroot /home/user/mmpretrain/testimgs/ --target-layers 'backbone.layer4' --save-path "/home/user/mmpretrain/viscam/mocov2_r50/"
processing img 1/300
process done
processing img 2/300
Traceback (most recent call last):
  File "tools/visualization/vis_cam.py", line 360, in <module>
    main()
  File "tools/visualization/vis_cam.py", line 339, in main
    aug_smooth=args.aug_smooth)
  File "/home/user/.local/lib/python3.7/site-packages/pytorch_grad_cam/base_cam.py", line 197, in __call__
    targets, eigen_smooth)
  File "/home/user/.local/lib/python3.7/site-packages/pytorch_grad_cam/base_cam.py", line 92, in forward
    loss.backward(torch.ones_like(t(o)),retain_graph=True)
RuntimeError: CUDA out of memory. Tried to allocate 6.25 GiB (GPU 0; 23.70 GiB total capacity; 13.46 GiB already allocated; 3.02 GiB free; 19.04 GiB reserved in total by PyTorch)
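
The figures in the error message above can be broken down to see why torch.cuda.empty_cache() had little effect. The following is a minimal sketch using only the Python standard library; the helper name parse_oom and the regular expressions are illustrative assumptions, written for the message format shown in this traceback (PyTorch 1.9) and not guaranteed to match other versions. "Allocated" is memory held by tensors that are still referenced; "reserved" is everything the caching allocator holds. empty_cache() can only return the difference between the two to the driver.

```python
import re

# Message copied from the traceback in this issue.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 6.25 GiB "
    "(GPU 0; 23.70 GiB total capacity; 13.46 GiB already allocated; "
    "3.02 GiB free; 19.04 GiB reserved in total by PyTorch)"
)


def parse_oom(message):
    """Extract the GiB figures from an OOM message of the format above.

    Hypothetical helper for illustration; not part of PyTorch or mmpretrain.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {
        name: float(re.search(pattern, message).group(1))
        for name, pattern in patterns.items()
    }


stats = parse_oom(OOM_MESSAGE)

# Cached but unused memory: the most that empty_cache() could release.
cached_unused = round(stats["reserved"] - stats["allocated"], 2)

print(f"requested:            {stats['requested']} GiB")
print(f"held by live tensors: {stats['allocated']} GiB")
print(f"cached but unused:    {cached_unused} GiB")
```

Here about 5.58 GiB is cached but unused, while 13.46 GiB is still owned by live tensors at the moment of failure. That allocated portion may include activations for the current image as well as anything still referenced from the previous one, and it is only released once those references go away.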

Environment

{'CUDA available': True, 'CUDA_HOME': None, 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0', 'GPU 0,1,2,3': 'NVIDIA GeForce RTX 3090', 'MMCV': '2.0.0rc4', 'MMEngine': '0.7.2', 'MMPreTrain': '1.0.0rc7+', 'OpenCV': '4.2.0', 'PyTorch': '1.9.0+cu111', 'Python': '3.7.9 (default, Aug 31 2020, 12:42:55) [GCC 7.3.0]', 'TorchVision': '0.10.0+cu111', 'numpy_random_seed': 2147483648, 'sys.platform': 'linux'}

Other information

No response

Ezra-Yu commented 1 year ago

Please try a smaller input resolution or a smaller model.
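
As a rough illustration of why a smaller resolution helps, the sketch below estimates the size of a single float32 feature map at the output of a ResNet-50 layer4 (2048 channels, overall stride 32) for two input sizes. The input sizes 224 and 448 are assumptions chosen for the example, since the issue does not state the actual image size, and the function feature_map_bytes is a hypothetical helper. It counts one tensor only, not the full memory used by the forward and backward passes.

```python
def feature_map_bytes(height, width, channels=2048, stride=32, bytes_per_value=4):
    """Bytes for one feature map of shape (channels, height // stride, width // stride).

    Defaults correspond to the layer4 output of a ResNet-50 stored as float32.
    """
    return channels * (height // stride) * (width // stride) * bytes_per_value


small = feature_map_bytes(224, 224)   # 2048 x 7 x 7 values
large = feature_map_bytes(448, 448)   # 2048 x 14 x 14 values
ratio = large / small

print(f"224x224 input: {small} bytes")
print(f"448x448 input: {large} bytes")
print(f"ratio: {ratio}")
```

Doubling each side of the input quadruples the size of this feature map, and the activations of the other convolutional layers scale the same way, so halving the resolution cuts activation memory to roughly a quarter.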