WXinlong / SOLO

SOLO and SOLOv2 for instance segmentation, ECCV 2020 & NeurIPS 2020.

The SOLO framework seems to eat a lot of GPU memory #162

Open marearth opened 3 years ago

marearth commented 3 years ago

When I run `python tools/train.py configs/solo/solo_r50_fpn_8gpu_1x.py` and reduce both images per GPU and workers per GPU to 1, I still get a "CUDA out of memory" error on a single 1080 Ti GPU (10.92 GB). It would be helpful if the GPU memory consumption of each model were documented before others try to reproduce the experiments. Could you give some reference figures for GPU memory consumption?
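For reference, the batch size and dataloader workers live in the `data` section of the mmdetection-style config this repo uses; a minimal sketch of the override I described, with field names assumed from the mmdetection 1.x convention rather than copied from the actual file:

```python
# Sketch of the data settings in an mmdetection-1.x-style config such as
# configs/solo/solo_r50_fpn_8gpu_1x.py (field names assumed, other
# entries like train/val/test dataset dicts omitted).
data = dict(
    imgs_per_gpu=1,     # batch size per GPU ("image per gpu" above)
    workers_per_gpu=1,  # dataloader worker processes per GPU
)

# Even at this setting the effective total batch size is tiny:
num_gpus = 1
print(num_gpus * data["imgs_per_gpu"])  # → 1
```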

zhuaiyi commented 3 years ago

I've met the same problem:

[>>>>>>>>>>>>> ] 20/76, 0.3 task/s, elapsed: 59s, ETA: 165s
Traceback (most recent call last):
...
RuntimeError: CUDA out of memory. Tried to allocate 3.30 GiB (GPU 0; 8.00 GiB total capacity; 973.14 MiB already allocated; 2.13 GiB free; 3.74 GiB reserved in total by PyTorch)
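The numbers in the message already show why the allocation fails; a quick sanity check (values copied from the traceback above, 1 GiB = 1024 MiB conversion assumed):

```python
# Memory budget from the traceback, in GiB.
total_gib     = 8.00           # GPU 0 total capacity
allocated_gib = 973.14 / 1024  # already allocated by live tensors
reserved_gib  = 3.74           # held by PyTorch's caching allocator
free_gib      = 2.13           # free on the device
request_gib   = 3.30           # size of the single failed allocation

# The one failed request is bigger than all remaining free memory, so
# the error is expected no matter how small the test set is.
print(request_gib > free_gib)  # → True
```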

Then I shrank my test set to 14 images; the same error occurred at [>> ] 2/14.

What should I do about this?

marearth commented 3 years ago

I solved the problem by using a GPU with more memory. The minimum memory per GPU seems to be 16 GB.

zhuaiyi commented 3 years ago

> I solved the problem by using a GPU with more memory. The minimum memory per GPU seems to be 16 GB.

Do you mean you switched to a better GPU? That seems too expensive for me...

WXinlong commented 3 years ago

@marearth SOLOv2 is much more GPU memory efficient. Please try SOLOv2 instead.

avinash-asink commented 3 years ago

@WXinlong I am getting the same issue with SOLOv2 R50_3x as well:

RuntimeError: CUDA out of memory. Tried to allocate 1.00 GiB (GPU 0; 11.17 GiB total capacity; 9.88 GiB already allocated; 503.81 MiB free; 10.23 GiB reserved in total by PyTorch)
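This traceback corroborates the suggestion above that the practical fix is a larger GPU: the caching allocator is already holding over 90% of an 11 GiB card before the failed 1 GiB request (values copied from the error message, MiB-to-GiB conversion assumed):

```python
# Memory budget from the SOLOv2 R50_3x traceback, in GiB.
total_gib    = 11.17
reserved_gib = 10.23          # held by PyTorch's caching allocator
free_gib     = 503.81 / 1024  # ~0.49 GiB actually free on the device
request_gib  = 1.00           # size of the failed allocation

# The 1 GiB request cannot fit into ~0.49 GiB of free memory, and the
# allocator already occupies most of the card.
print(request_gib > free_gib)          # → True
print(reserved_gib / total_gib > 0.9)  # → True
```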