How to Perform Inference on Multiple Videos?

Hi, thanks for your great work!

I find that the inference code only supports processing one video at a time, so I modified it to process multiple videos in sequence (for example, an entire dataset). The results for the first video are good, but problems appear in the second and subsequent videos. After adjusting some parameter settings, I suspect this is related to buffer/memory state carried over from the previous video. How should I modify my code?

I use 'online' mode with text prompts ('text-prompted'). The 'semionline' mode has the same problem.
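To illustrate what I suspect is happening, here is a minimal toy sketch. It is not the DEVA code: `ToyTracker`, `run_shared`, and `run_fresh` are hypothetical names I made up, and the "memory" is just a list. It only shows how a tracker object reused across videos carries its memory and ID counter into the next video, while creating a new object per video starts clean.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTracker:
    """Hypothetical stand-in for a stateful video tracker (NOT the DEVA API)."""
    memory: list = field(default_factory=list)  # frames remembered so far
    next_object_id: int = 1                     # running object ID counter

    def step(self, frame):
        # Store the frame in memory and hand out the next object ID.
        self.memory.append(frame)
        assigned_id = self.next_object_id
        self.next_object_id += 1
        return assigned_id, len(self.memory)


def run_shared(videos):
    # One tracker reused for every video: state leaks between videos.
    tracker = ToyTracker()
    return [[tracker.step(frame) for frame in video] for video in videos]


def run_fresh(videos):
    # A new tracker per video: each video starts with empty state.
    results = []
    for video in videos:
        tracker = ToyTracker()
        results.append([tracker.step(frame) for frame in video])
    return results


videos = [["a0", "a1"], ["b0", "b1"]]
shared = run_shared(videos)
fresh = run_fresh(videos)

print(shared[1][0])  # first frame of video 2, shared tracker: (3, 3)
print(fresh[1][0])   # first frame of video 2, fresh tracker: (1, 1)
```

With the shared tracker, the second video's first frame already sees two remembered frames and an advanced ID counter, which is similar to the symptom I observe. Whether the real inference code needs a new object per video or an explicit reset is exactly what I am unsure about.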

The result of the first video's first frame:

The result of the second video's first frame, when processed right after the first video:

If I perform inference on the second video directly (no video was processed before), the result of the first frame looks good:

hkchengrex / Tracking-Anything-with-DEVA, issue #99