WisconsinAIVision / ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
https://vip-llava.github.io/
Apache License 2.0

[Question] The checkpoint finetuned on RefCOCOg #19

Open zd11024 opened 1 week ago

zd11024 commented 1 week ago

Question

Hi,

Thanks for your excellent work! Do you have any plans to release the model fine-tuned on RefCOCOg?

mu-cai commented 1 week ago

Thanks! I will release it once I come back from CVPR!

mu-cai commented 3 days ago

The checkpoint is here: https://huggingface.co/mucai/vip-llava-7b-refcocog-ft