hkchengrex / Mask-Propagation

[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.
https://hkchengrex.github.io/MiVOS/
MIT License

Has anyone experienced CUDA out of memory? #8

Meryl-Fang closed this issue 3 years ago

Meryl-Fang commented 3 years ago

Hi! Thank you for sharing this great code. I was wondering what machine you used for training and inference. I am running inference on my own data, and some of the larger video sequences cause CUDA out-of-memory errors (V100 16GB). I also tried loading the model in fp16, but I feel the accuracy was compromised because some folders did not have anything segmented.

Please let me know if there's anything you'd suggest us trying. Thank you very much!

hkchengrex commented 3 years ago

The easiest way is to reduce the memory frequency (which means increasing the value of mem_freq): https://github.com/hkchengrex/Mask-Propagation/blob/b5d8e61c87e1944426c3aed43685fd0eb336353a/inference_core.py#L20
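To illustrate why a larger mem_freq lowers memory use, here is a minimal sketch. It assumes that a frame is written to the memory bank whenever its index is a multiple of mem_freq; that rule and the helper name `memory_frame_indices` are assumptions made for illustration, not code from the repository, which may handle the first and last frames differently.

```python
from typing import List


def memory_frame_indices(num_frames: int, mem_freq: int) -> List[int]:
    """Indices of frames that would be stored as memory frames.

    Assumed rule (hypothetical): store every mem_freq-th frame, starting at 0.
    """
    if mem_freq < 1:
        raise ValueError("mem_freq must be a positive integer")
    return [i for i in range(num_frames) if i % mem_freq == 0]


# Compare how many memory frames a 100-frame video accumulates.
for mem_freq in (5, 10, 20):
    kept = memory_frame_indices(100, mem_freq)
    print(f"mem_freq={mem_freq}: {len(kept)} memory frames")
```

Under this assumption, doubling mem_freq roughly halves the number of stored memory frames, and with it the size of the memory bank held on the GPU.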

The second way (not sure if you are doing this already) is to not cache the input video on the GPU and to load a frame to the GPU only when needed (this is done in our YouTubeVOS evaluation). This is important because the video itself is large.
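A rough sketch of the reasoning behind loading frames on demand follows. It estimates the size of a dense fp32 RGB video and shows a lazy iterator that moves one frame at a time. The helper names `video_bytes` and `iter_frames_on_device` are hypothetical, and the 1000-frame video is an example figure, not a measurement from this thread. With PyTorch, the `to_device` argument could be something like `lambda t: t.cuda()`.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def video_bytes(num_frames: int, height: int, width: int,
                channels: int = 3, bytes_per_value: int = 4) -> int:
    """Size in bytes of a dense video array; 4 bytes per value matches fp32."""
    return num_frames * height * width * channels * bytes_per_value


def iter_frames_on_device(cpu_frames: Iterable[T],
                          to_device: Callable[[T], T]) -> Iterator[T]:
    """Yield frames one by one, moving each to the device only when requested."""
    for frame in cpu_frames:
        yield to_device(frame)


GIB = 1024 ** 3
print(f"1000 frames at 480x854: {video_bytes(1000, 480, 854) / GIB:.2f} GiB")
print(f"one frame at 480x854:   {video_bytes(1, 480, 854) / GIB:.4f} GiB")

# Stand-in data: strings instead of tensors, a tagging function instead of .cuda().
for frame in iter_frames_on_device(["f0", "f1"], lambda f: f + "@gpu"):
    print(frame)
```

Keeping the full video in CPU memory and transferring a single frame per step trades a small per-frame copy cost for a much smaller GPU footprint.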

Also, 480p videos suffice; you might want to downsize your videos if needed.
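As a small sketch of the downsizing step, the helper below computes a target resolution whose shorter side is 480 pixels while keeping the aspect ratio. The function name `downsize_to_short_side` and the rounding choice are assumptions for illustration; the actual resizing would be done with whatever image library is already in the pipeline.

```python
from typing import Tuple


def downsize_to_short_side(height: int, width: int,
                           target: int = 480) -> Tuple[int, int]:
    """Return (height, width) scaled so the shorter side equals target.

    Frames that are already at or below the target are left unchanged.
    """
    short = min(height, width)
    if short <= target:
        return height, width
    scale = target / short
    return round(height * scale), round(width * scale)


for h, w in [(1080, 1920), (720, 1280), (360, 640)]:
    new_h, new_w = downsize_to_short_side(h, w)
    ratio = (h * w) / (new_h * new_w)
    print(f"{h}x{w} -> {new_h}x{new_w} ({ratio:.2f}x fewer pixels)")
```

Going from 1080p to 480p cuts the pixel count by roughly a factor of five, which reduces the size of both the input frames and the feature maps derived from them.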

I haven't tried fp16, but I am surprised that it didn't work...

hkchengrex commented 3 years ago

And we used two 1080Ti for training and a 2080Ti for evaluation. All are 11GB GPUs. We only used fp32.

Meryl-Fang commented 3 years ago

Thank you for the detailed responses! We tried both and they worked.