microsoft / SoM

Set-of-Mark Prompting for GPT-4V and LMMs
MIT License

SoM works with open-sourced multimodal LM? #13

Closed j-min closed 7 months ago

j-min commented 10 months ago

Impressive work 👍

I'm interested in whether SoM also works with open-source multimodal LMs such as LLaVA v1.5. If you have tried this, could you share your experience?

quantaji commented 9 months ago

I have the same question too!

abrichr commented 8 months ago

There should not be any reason for the principles behind SoM not to work with any LMM/VLM. This repository does not implement integrations with other models, but in principle it should be straightforward.
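
To illustrate why the model-side integration could be thin, here is a minimal sketch of the model-agnostic part: assigning numeric marks to regions and composing a prompt that refers to them. It is not code from this repository. `Region`, `assign_marks`, and `build_prompt` are hypothetical names, the boxes are made-up values standing in for the output of a segmentation step, and drawing the marks onto the image and calling a particular LMM are left out.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A segmented region as a pixel bounding box (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int


def assign_marks(regions):
    """Give each region a numeric mark and a label position at its box center."""
    marks = []
    for number, r in enumerate(regions, start=1):
        center = ((r.x0 + r.x1) // 2, (r.y0 + r.y1) // 2)
        marks.append((number, center))
    return marks


def build_prompt(question, marks):
    """Compose a text prompt that tells the model about the overlaid marks."""
    ids = ", ".join(str(number) for number, _ in marks)
    return (
        f"The image is annotated with numeric marks: {ids}. "
        f"Refer to regions by their mark. {question}"
    )


# Made-up boxes standing in for real segmentation output.
regions = [Region(0, 0, 100, 50), Region(20, 60, 80, 120)]
marks = assign_marks(regions)
prompt = build_prompt("Which mark is on the cup?", marks)
```

Under these assumptions, the marked image and `prompt` are ordinary image and text inputs, so any model that accepts an image plus text could consume them. How well a given open-source model actually reads the marks is a separate question that this sketch does not address.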