The easiest way is to reduce the memorization frequency, which means increasing `mem_freq`: https://github.com/hkchengrex/Mask-Propagation/blob/b5d8e61c87e1944426c3aed43685fd0eb336353a/inference_core.py#L20
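For illustration, here is a minimal sketch of what raising `mem_freq` does; the names (`encode`, `memory_bank`) are hypothetical stand-ins, not the repo's actual loop. A frame is committed to the memory bank only every `mem_freq` frames, so a larger value keeps fewer feature maps resident on the GPU:

```python
import torch

mem_freq = 10     # larger value -> frames memorized less often -> smaller memory bank
memory_bank = []  # stores features of the memorized frames

def encode(frame):
    # placeholder for the real key/value encoder
    return frame.mean(0, keepdim=True), frame

frames = torch.randn(100, 3, 480, 854)  # dummy video for the sketch
for ti in range(frames.shape[0]):
    frame = frames[ti]
    # ... query memory_bank here to segment the current frame ...
    if ti % mem_freq == 0:
        memory_bank.append(encode(frame))  # fewer additions -> less GPU memory
```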
The second way (not sure if you are doing this already) is to not cache the input video on the GPU, and instead load a frame onto the GPU only when needed (this is what our YouTubeVOS evaluation does). This is important as the video itself is large.
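A rough sketch of that idea, assuming a plain PyTorch loop (the tensor shapes and names here are placeholders):

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
frames = torch.randn(100, 3, 480, 854)  # decoded video stays in CPU RAM

for ti in range(frames.shape[0]):
    frame = frames[ti].unsqueeze(0).to(device)  # only one frame lives on the GPU
    # ... run the propagation network on `frame` ...
    del frame  # release it before the next iteration
```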
Also, 480p videos suffice; you might want to downscale your videos if needed.
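If it helps, a small sketch of downscaling so the shorter side is at most 480 px (`downscale_to_480p` is just an illustrative helper, not part of the repo):

```python
import torch
import torch.nn.functional as F

def downscale_to_480p(frame):
    """Resize a (C, H, W) float tensor so its shorter side is at most 480 px."""
    _, h, w = frame.shape
    scale = 480 / min(h, w)
    if scale >= 1:
        return frame  # already 480p or smaller
    return F.interpolate(frame.unsqueeze(0),
                         size=(round(h * scale), round(w * scale)),
                         mode='bilinear', align_corners=False).squeeze(0)

frame = torch.randn(3, 1080, 1920)
print(downscale_to_480p(frame).shape)  # torch.Size([3, 480, 853])
```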
I haven't tried fp16, but I am surprised that it didn't work...
And we used two 1080Ti for training and a 2080Ti for evaluation. All are 11GB GPUs. We only used fp32.
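If you do retry reduced precision, `torch.cuda.amp.autocast` may be gentler than casting the whole model with `.half()`, since it keeps numerically sensitive ops in fp32. A generic sketch (the model here is a stand-in, not ours):

```python
import torch

model = torch.nn.Conv2d(3, 1, 3, padding=1).cuda().eval()  # stand-in network
frame = torch.randn(1, 3, 480, 854, device='cuda')

with torch.no_grad(), torch.cuda.amp.autocast():
    logits = model(frame)  # convs run in fp16, sensitive ops stay in fp32
```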
Thank you for the detailed responses! We tried both and they worked.
Hi! Thank you for sharing this great code. I was wondering what machine you used for training and inference. I was using it for inference on my own data, and some of the bigger video sequences yield CUDA out of memory (V100, 16GB). I also tried loading the model in fp16, but I feel the accuracy was compromised because some folders did not have anything segmented.
Please let me know if there's anything you'd suggest we try. Thank you very much!