rese1f / MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
https://rese1f.github.io/MovieChat/
BSD 3-Clause "New" or "Revised" License

Questions about model inference speed #67

Open LanXingXuan opened 5 months ago

LanXingXuan commented 5 months ago

When running inference on an A100, processing a video of about 1,000 seconds takes nearly three minutes. Is this normal?

After some profiling, I found that most of the time is spent on video segmentation and saving the segments to disk. Is there any way to speed up inference?
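One common way to avoid the segment-and-save overhead is to decode only the frames the model actually needs, directly in memory, instead of first splitting the video into clip files on disk. The sketch below is purely illustrative and is not MovieChat's actual API; `sample_indices` is a hypothetical helper that picks evenly spaced frame indices so a decoder (e.g. OpenCV's `VideoCapture` or decord's `VideoReader`) can seek to just those frames.

```python
# Hypothetical sketch, not MovieChat's implementation: pick which frames
# to decode in memory rather than writing intermediate segments to disk.

def sample_indices(total_frames: int, target_count: int) -> list[int]:
    """Return up to target_count evenly spaced frame indices.

    Seeking the decoder to these indices skips decoding (and saving)
    the frames in between, which removes the disk I/O bottleneck.
    """
    if total_frames <= target_count:
        return list(range(total_frames))
    step = total_frames / target_count
    return [int(i * step) for i in range(target_count)]

# Example: a ~1,000 s video at 30 fps has ~30,000 frames; sampling 512
# of them means decoding under 2% of the frames.
indices = sample_indices(30_000, 512)
```

With a seek-capable reader you would then fetch only `indices` (e.g. `vr.get_batch(indices)` in decord), so no temporary segment files are ever written.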