THU-MIG / RepViT

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything
https://arxiv.org/abs/2307.09283
Apache License 2.0
799 stars 60 forks

[project] grounded-repvit-sam demo support #29

Closed by rentainhe 11 months ago

rentainhe commented 11 months ago

Hello! Thanks a lot for your great work! We've added support for a grounded-repvit-sam demo in Grounded-Segment-Anything!

jameslahm commented 11 months ago

Thank you! Grounded-Segment-Anything is awesome!

xiaobanni commented 11 months ago

@rentainhe Thank you for your excellent work on the Grounded-Segment-Anything project. However, I've noticed that the runtime is dominated by the Grounding-Dino module rather than the SAM module. As shown in the following picture, MobileSAM takes only 0.05s, whereas Grounding-Dino takes 1.70s, about 34 times longer. Do you have any plans to speed up the Grounding-Dino part, or is there an existing off-the-shelf solution?

[image: timing results for MobileSAM and Grounding-Dino]
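For anyone who wants to reproduce this kind of per-stage breakdown, here is a minimal sketch of how the timing could be collected. The functions `run_detector` and `run_segmenter` are hypothetical stand-ins (they just sleep), not the Grounded-Segment-Anything API, and the `reported` dictionary simply reuses the two numbers quoted above.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, timings):
    """Accumulate the wall-clock time of the enclosed block under `name`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + (time.perf_counter() - start)


def share_of_total(timings):
    """Return each stage's fraction of the summed time."""
    total = sum(timings.values())
    return {name: t / total for name, t in timings.items()}


# Hypothetical stand-ins for the two stages; replace with real model calls.
def run_detector(image):
    time.sleep(0.05)  # pretend the detector is the slow stage
    return [(0, 0, 10, 10)]


def run_segmenter(image, boxes):
    time.sleep(0.002)  # pretend the segmenter is the fast stage
    return ["mask"] * len(boxes)


timings = {}
with timed("detector", timings):
    boxes = run_detector(None)
with timed("segmenter", timings):
    masks = run_segmenter(None, boxes)

# Note: when timing real GPU models with PyTorch, kernels launch
# asynchronously, so call torch.cuda.synchronize() before reading the clock.

# The same arithmetic applied to the numbers quoted in the comment above.
reported = {"Grounding-Dino": 1.70, "MobileSAM": 0.05}
shares = share_of_total(reported)
for name, frac in shares.items():
    print(f"{name}: {frac:.1%} of total")
```

With the quoted numbers, Grounding-Dino accounts for roughly 97% of the combined time, so a faster SAM variant alone changes the end-to-end latency very little.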