Open lsn199603 opened 1 year ago
It is a good point. I believe we can improve the throughput with technical optimizations. It would be helpful if you'd like to provide PRs.
Hey @lsn199603 , does GroundingDINO work on live video captures?
Hello, I only tested on an mp4 video file, not on an RTSP video stream.
Awesome, thanks! Is the implementation for mp4 video files similar to a YOLO video object detection implementation?
thanks
Yes, the prompt needs to be configured in advance
Thank you!
Have you made any progress on pre-encoding?
Hey @lsn199603, if you don’t mind, could you share the specifications you used to achieve 5 FPS? Specifically:
In my test, with an input image of 1200x1800, DINO detects 5 objects, and the prompt includes 13 categories (e.g., "xxx., yyy., zzz.,...") totaling 133 characters.
The GroundingDINO inference results are very good. However, the inference speed is only 5 FPS. Is it possible to improve the inference speed by pre-encoding the text? Looking forward to your reply!