WisconsinAIVision / ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
https://vip-llava.github.io/
Apache License 2.0

Training Stage #5

Closed — 980044579 closed this issue 9 months ago

980044579 commented 9 months ago

Question

Very exciting work! How long do the "Visual Instruction Tuning" and "Finetuning on GPT-4V Instruction Data" stages take?