Open-set object detection model based on image prompts

IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

https://arxiv.org/abs/2303.05499

Apache License 2.0

6.83k stars 693 forks

Open Qia98 opened 3 months ago

Qia98 commented 3 months ago

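GroundingDINO is excellent work on text-prompted open-set object detection. Do you have any plans to explore an open-set object detection model based on image prompts? MQ-Det is currently the closest related work: it combines text prompts and image prompts for object detection.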

meaquanana commented 2 weeks ago

I have the same question. How can I generate a heatmap for a VLM such as Grounding DINO or GLIP?