MCG-NKU / E2FGVI

Official code for "Towards An End-to-End Framework for Flow-Guided Video Inpainting" (CVPR2022)

CUDA out of memory #66

Open SerhiiPostupaiev opened 1 year ago

SerhiiPostupaiev commented 1 year ago

Hello, @Paper99, @LGYoung, @NK-CS-ZZL. I am trying to run the evaluation script on a CUDA GPU.

I confirmed that PyTorch can see the GPU on my PC:

>>> import torch
>>> torch.cuda.is_available()
True
>>> torch.cuda.device_count()
1
>>> torch.cuda.current_device()
0
>>> torch.cuda.device(0)
<torch.cuda.device at 0x7efce0b03be0>
>>> torch.cuda.get_device_name(0)
'GeForce GTX 950M'

I am using Windows 11.

When I run the evaluation script, I get the following error:

(ttt) F:\training\E2FGVI>python evaluate.py --model e2fgvi --dataset davis --data_root datasets/ --ckpt release_model/E2FGVI-CVPR22.pth
C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\mmcv\__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
load pretrained SPyNet...
load checkpoint from http path: https://download.openmmlab.com/mmediting/restorers/basicvsr/spynet_20210409-c6c1bd09.pth
Loading from: release_model/E2FGVI-CVPR22.pth
Start evaluation...
[Loading I3D model from ./release_model/i3d_rgb_imagenet.pt for FID score ..]
Traceback (most recent call last):
  File "evaluate.py", line 176, in <module>
    main_worker(args)
  File "evaluate.py", line 92, in main_worker
    pred_img, _ = model(masked_frames, len(neighbor_ids))
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "F:\training\E2FGVI\model\e2fgvi.py", line 255, in forward
    trans_feat = self.transformer(trans_feat)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
    input = module(input)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "F:\training\E2FGVI\model\modules\tfocal_transformer.py", line 523, in forward
    mask_all=x_window_masks_all)  # nW*B, T*window_size*window_size, C
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "F:\training\E2FGVI\model\modules\tfocal_transformer.py", line 394, in forward
    attn = self.softmax(attn)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\modules\activation.py", line 1044, in forward
    return F.softmax(input, self.dim, _stacklevel=5)
  File "C:\Users\postu\miniconda3\envs\ttt\lib\site-packages\torch\nn\functional.py", line 1442, in softmax
    ret = input.softmax(dim)
RuntimeError: CUDA out of memory. Tried to allocate 668.00 MiB (GPU 0; 4.00 GiB total capacity; 2.36 GiB already allocated; 0 bytes free; 3.05 GiB reserved in total by PyTorch)
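
As a rough aid to reading the error above, the sizes it reports can be pulled out and compared. This is a minimal sketch that uses only the Python standard library; `summarize_oom` and the dictionary keys are hypothetical names, not part of PyTorch or E2FGVI, and the interpretation given in the comments is an assumption rather than something the traceback confirms.

```python
import re

# Error text copied from the traceback above.
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 668.00 MiB "
    "(GPU 0; 4.00 GiB total capacity; 2.36 GiB already allocated; "
    "0 bytes free; 3.05 GiB reserved in total by PyTorch)"
)

# Conversion factors to MiB for the units that appear in the message.
_TO_MIB = {"bytes": 1 / 1024 ** 2, "KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) (bytes|KiB|MiB|GiB)"


def _size_in_mib(message, pattern):
    """Return the size matched by `pattern`, converted to MiB."""
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("field not found: " + pattern)
    value, unit = match.groups()
    return float(value) * _TO_MIB[unit]


def summarize_oom(message):
    """Hypothetical helper: derive a few figures from an OOM message."""
    requested = _size_in_mib(message, "Tried to allocate " + _SIZE)
    total = _size_in_mib(message, _SIZE + " total capacity")
    allocated = _size_in_mib(message, _SIZE + " already allocated")
    free = _size_in_mib(message, _SIZE + " free")
    reserved = _size_in_mib(message, _SIZE + " reserved in total")
    return {
        "requested_mib": requested,
        "total_mib": total,
        # Reserved by PyTorch but not currently backing any tensor.
        "cached_unused_mib": reserved - allocated,
        # Neither free nor reserved by PyTorch; assumed to be held by
        # other consumers such as the display or the CUDA context.
        "outside_pytorch_mib": total - reserved - free,
    }


if __name__ == "__main__":
    for key, value in summarize_oom(OOM_MESSAGE).items():
        print(f"{key}: {value:.2f}")
```

For this message the sketch gives about 707 MiB reserved by PyTorch but not allocated, and about 973 MiB of the 4 GiB card that is neither free nor reserved by PyTorch. The figures in the message are rounded, so these values are approximate. Either way, the 668 MiB request could not be satisfied on this card.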

Is there anything that can be done here, or is my PC hardware too weak to run the evaluation?

cmn1565080456 commented 1 year ago

The likely cause is that your GPU does not have enough memory to run this model: the error message reports a total capacity of only 4 GiB. You should be able to solve this by switching to a graphics card with more GPU memory.
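
To check how much memory a card actually has before running the script, something like the following could be used. It is a minimal sketch: `report_gpu_memory` and `format_mib` are hypothetical helper names, and the only PyTorch calls assumed are `torch.cuda.is_available`, `torch.cuda.get_device_properties` and `torch.cuda.memory_allocated`.

```python
try:
    import torch
except ImportError:  # allows the helpers to be imported without PyTorch
    torch = None


def format_mib(num_bytes):
    """Format a byte count as MiB with two decimals."""
    return f"{num_bytes / 1024 ** 2:.2f} MiB"


def report_gpu_memory(device_index=0):
    """Hypothetical helper: describe total and allocated GPU memory."""
    if torch is None or not torch.cuda.is_available():
        return "CUDA is not available"
    props = torch.cuda.get_device_properties(device_index)
    # Memory currently occupied by tensors in this process only.
    allocated = torch.cuda.memory_allocated(device_index)
    return (
        f"{props.name}: total {format_mib(props.total_memory)}, "
        f"allocated by this process {format_mib(allocated)}"
    )


if __name__ == "__main__":
    print(report_gpu_memory())
```

Comparing the reported total against the sizes in the out-of-memory message gives a quick indication of whether the card is simply too small for the model at the current input size.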