Vision-CAIR / LongVU

https://vision-cair.github.io/LongVU

Application on very long video / streaming media such as CCTV #8

Open tamal-thetaonelab opened 3 weeks ago

tamal-thetaonelab commented 3 weeks ago

What would be a good approach for handling streaming media or very long footage?

As I understand it, the model can currently handle videos of about an hour in length. Given this restriction, how can I implement a sliding window (say, one hour) over streaming CCTV footage? Let's say I want queries to always be answered on the last hour of the video.

Any advice is greatly appreciated.

xiaoqian-shen commented 3 weeks ago

Thanks for your interest in our approach. From what I understand, you're mainly interested in maintaining only the latest 1 hour of footage without needing to keep the memory of the historical data, right? In this case, you can truncate from the left of the input sequence to maintain a moving 1-hour window for the latest streaming frames.
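
A minimal sketch of the left-truncation idea described above, in plain Python. It is not part of the LongVU codebase: the class name `SlidingFrameWindow`, its methods, and the use of per-frame timestamps in seconds are all assumptions made for illustration. The buffer holds sampled frames and drops the oldest ones from the left whenever they fall outside the window, so a query always sees at most the latest hour.

```python
from collections import deque


class SlidingFrameWindow:
    """Hypothetical helper: keeps only frames from the last `window_seconds`."""

    def __init__(self, window_seconds=3600.0):
        self.window_seconds = window_seconds
        # (timestamp, frame) pairs, oldest on the left, newest on the right.
        self._frames = deque()

    def add(self, timestamp, frame):
        """Append a new frame, then truncate from the left.

        Assumes timestamps are in seconds and arrive in non-decreasing order.
        """
        self._frames.append((timestamp, frame))
        cutoff = timestamp - self.window_seconds
        # Drop every frame at or before the cutoff.
        while self._frames and self._frames[0][0] <= cutoff:
            self._frames.popleft()

    def frames(self):
        """Return the frames currently inside the window, oldest first."""
        return [frame for _, frame in self._frames]

    def __len__(self):
        return len(self._frames)
```

In use, each sampled frame from the stream would be passed to `add`, and `frames()` would be handed to the model's usual preprocessing when a query arrives. How often to sample and how to convert the frames into model input depends on your setup and is left out here.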