How to run inference on a video?

IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

https://arxiv.org/abs/2303.05499

Apache License 2.0

6.89k stars, 697 forks

How to run inference on a video? #233

Open IronmanVsThanos opened 1 year ago

IronmanVsThanos commented 1 year ago

How do I run inference on a video?

rentainhe commented 1 year ago

You can extract each frame of the video and run Grounding DINO on every frame, or you can try an open-world tracking model such as DEVA to track objects across the video.
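The frame-by-frame option could be sketched roughly as follows. This is a minimal illustration, not code from the repository: `iter_frames`, `detect_per_frame`, and the `stride` parameter are hypothetical names, and `detect_fn` stands in for whatever wrapper you write around Grounding DINO's single-image inference (including its image preprocessing). Frame reading assumes OpenCV (`cv2`) is installed.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def iter_frames(video_path: str, stride: int = 1) -> Iterator[Tuple[int, Any]]:
    """Yield (frame_index, BGR frame) pairs, keeping every `stride`-th frame."""
    import cv2  # imported lazily so the rest of the sketch works without OpenCV

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"cannot open video: {video_path}")
    try:
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:  # end of stream or read error
                break
            if index % stride == 0:
                yield index, frame
            index += 1
    finally:
        cap.release()


def detect_per_frame(
    frames: Iterable[Tuple[int, Any]],
    detect_fn: Callable[[Any], Any],
) -> List[Tuple[int, Any]]:
    """Run a single-image detector on each frame independently.

    `detect_fn` is an assumption: a user-supplied callable that takes one
    frame and returns that frame's detections (boxes, scores, phrases, ...).
    """
    return [(index, detect_fn(frame)) for index, frame in frames]
```

Usage would look like `detect_per_frame(iter_frames("video.mp4", stride=5), detect_fn)`, where the path is a placeholder. Because each frame is processed independently, detections carry no identity from one frame to the next; linking them over time is the part a tracking model such as DEVA would add.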