magic-research / bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
https://bubo-gpt.github.io/
BSD 3-Clause "New" or "Revised" License

About GPT-4 in match.py #5

Open HaisongDing opened 1 year ago

HaisongDing commented 1 year ago

I notice that you use OpenAI's GPT-4 directly to match captions with grounded entities. Why not train a custom model instead, using existing datasets such as the ones used for KOSMOS-2 or Shikra?

HaisongDing commented 1 year ago

Follow-up question: why not use GLIP or Grounding DINO (G-DINO) directly?