fatsoengineer opened this issue 4 years ago
Your GPU is running out of memory. You can try these options:
1- Use torch.cuda.empty_cache() to clear the memory cache.
2- Kill the process that is using the GPU (use nvidia-smi to find the process name). If it is Python, use sudo pkill python to kill it.
3- You may be using large batches. Try reducing the batch size.
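Options 1 and 3 above can be sketched in a few lines of Python. This is a minimal illustration, not the issue's actual training script; the dataset and batch sizes here are made up for the example:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Option 1: release cached GPU memory held by PyTorch's caching allocator.
# Note this only returns *unused* cached blocks to the driver; it does not
# free tensors that are still referenced. Safe to call on CPU-only machines.
if torch.cuda.is_available():
    torch.cuda.empty_cache()

# Option 3: reduce the batch size handed to the DataLoader.
# Hypothetical dataset of 64 samples; batch_size=8 instead of, say, 64
# lowers the peak memory needed per forward/backward pass.
dataset = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))
loader = DataLoader(dataset, batch_size=8)
batches = list(loader)
print(len(batches))  # 64 samples / batch size 8 -> 8 batches
```

Smaller batches trade throughput for peak memory; the OOM threshold depends on the model and GPU, so the batch size usually has to be found by trial.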
@mejdidallel Where should I put torch.cuda.empty_cache()? I ran into the same problem, but I don't know where it should go. Can you give me some advice? Thank you.
Just put it in the file you are running when you get the error, so the cache is cleared every time you run it. Or you can simply open a Python terminal and type import torch, then torch.cuda.empty_cache().
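As a minimal sketch of the placement suggested above, assuming a typical training loop (the model, optimizer, and data below are hypothetical stand-ins, not mmskeleton code):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in model for illustration
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(3):
    x = torch.randn(4, 10)
    loss = model(x).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    # One common placement: after each iteration, once intermediate
    # tensors have gone out of scope, return cached blocks to the driver.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```

Keep in mind that empty_cache() cannot free memory that live tensors still occupy, so if the allocation itself is too large (as in the traceback below), reducing the batch size is usually the more effective fix.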
Load configuration information from configs/recognition/st_gcn_aaai18/ntu-rgbd-xsub/test.yaml
Downloading: "https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmskeleton/models/st-gcn/st_gcn.ntu-xsub-300b57d4.pth" to /root/.cache/torch/checkpoints/st_gcn.ntu-xsub-300b57d4.pth
100% 11.9M/11.9M [00:02<00:00, 4.69MB/s]
terminal width is too small (0), please consider widen the terminal for better progressbar visualization
[ ] 0/16487, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "mmskl.py", line 121, in <module>
    main()
  File "mmskl.py", line 115, in main
    call_obj(**cfg.processor_cfg)
  File "/content/drive/My Drive/Backup/Graphs/pose_detection/mmskeleton/mmskeleton/utils/importer.py", line 24, in call_obj
    return import_obj(type)(**kwargs)
  File "/content/drive/My Drive/Backup/Graphs/pose_detection/mmskeleton/mmskeleton/processor/recognition.py", line 33, in test
    output = model(data).data.cpu().numpy()
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Backup/Graphs/pose_detection/mmskeleton/mmskeleton/models/backbones/st_gcn_aaai18.py", line 90, in forward
    x, _ = gcn(x, self.A * importance)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Backup/Graphs/pose_detection/mmskeleton/mmskeleton/models/backbones/st_gcn_aaai18.py", line 203, in forward
    x, A = self.gcn(x, A)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/Backup/Graphs/pose_detection/mmskeleton/mmskeleton/ops/st_gcn/gconv_origin.py", line 63, in forward
    x = torch.einsum('nkctv,kvw->nctw', (x, A))
  File "/usr/local/lib/python3.6/dist-packages/torch/functional.py", line 201, in einsum
    return torch._C._VariableFunctions.einsum(equation, operands)
RuntimeError: CUDA out of memory. Tried to allocate 5.49 GiB (GPU 0; 11.17 GiB total capacity; 7.38 GiB already allocated; 2.53 GiB free; 953.11 MiB cached)
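Note that the failing line (output = model(data).data.cpu().numpy()) is an evaluation pass, so gradients are not needed. If the mmskeleton code does not already disable autograd there, wrapping the forward pass in torch.no_grad() avoids storing activations for backprop and can substantially reduce peak memory. A minimal sketch with a hypothetical stand-in model, not the actual ST-GCN:

```python
import torch

model = torch.nn.Linear(8, 4)  # stand-in for the real model
data = torch.randn(2, 8)       # stand-in for a test batch

model.eval()
# Disabling autograd means intermediate activations are not retained,
# which is often enough to get an inference-time OOM under the limit.
with torch.no_grad():
    output = model(data).cpu().numpy()
print(output.shape)  # (batch, classes) -> (2, 4)
```

If memory is still short after that, the remaining lever is the test batch size in the yaml config, since the 5.49 GiB einsum allocation scales with the number of samples per batch.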