facebookresearch / Mask2Former

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
MIT License

Video segmentation crashes with more than 100 frames #138

Open fschvart opened 2 years ago

fschvart commented 2 years ago

I have a video I want to run inference on. It's a 50 s video at 60 fps, so 3000 frames in total. I'm running inference on an RTX 3090, but I can't get past 100 frames; beyond that I immediately get a CUDA OOM error. Is there anything I can do to run inference on the full video on an RTX 3090, or am I limited to running it in batches of 100 frames?

Thanks

bowenc0221 commented 1 year ago

You can split the video into multiple clips and merge the predictions afterwards. However, this is not supported by the code in this repository.
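
A minimal sketch of the clip-splitting idea, not code from this repository. It assumes the frames are already loaded into a list and that `predict_clip` is a hypothetical callable you supply, which takes a list of frames and returns one prediction per frame. The names `clip_ranges` and `run_in_clips` are also made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def clip_ranges(num_frames: int, clip_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges that cover the whole video."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [
        (start, min(start + clip_len, num_frames))
        for start in range(0, num_frames, clip_len)
    ]


def run_in_clips(
    frames: Sequence,
    predict_clip: Callable[[Sequence], list],  # hypothetical: one output per frame
    clip_len: int = 100,
) -> list:
    """Run inference clip by clip and concatenate the per-frame predictions."""
    merged = []
    for start, end in clip_ranges(len(frames), clip_len):
        preds = predict_clip(frames[start:end])
        if len(preds) != end - start:
            raise ValueError("predict_clip must return one prediction per frame")
        merged.extend(preds)
    return merged
```

For the 3000-frame video above, `clip_ranges(3000, 100)` yields 30 clips of 100 frames each, so only one clip needs to fit in GPU memory at a time. Note that this sketch only concatenates per-frame outputs. Instance identities are not guaranteed to match from one clip to the next, so linking them (for example, by comparing masks on frames shared between overlapping clips) would be an additional step that is not shown here.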