magic-research / bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs
https://bubo-gpt.github.io/
BSD 3-Clause "New" or "Revised" License
503 stars, 35 forks

What is the difference from MiniGPT4? #16

Open nkjulia opened 1 year ago

nkjulia commented 1 year ago

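As the title says. Judging from the paper, compared with MiniGPT4, BuboGPT adds audio as a supported modality, and adds a pipeline after the LLM (Vicuna) output that aligns the entities mentioned in the response with their locations in the image.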
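To make the second point concrete, here is a toy sketch of what "aligning entities with image locations after the LLM output" could look like. It is not BuboGPT's actual code: `Region`, `ground_entities`, the labels, and the box coordinates are all hypothetical, and the sketch assumes some upstream detector has already produced labeled boxes. It only shows the idea of matching words in the generated answer against those labels.

```python
import re
from dataclasses import dataclass


@dataclass
class Region:
    """A labeled image region (hypothetical output of an upstream detector)."""
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels


def ground_entities(llm_output, regions):
    """Map each region label that appears as a whole word in the text to its boxes."""
    text = llm_output.lower()
    grounded = {}
    for region in regions:
        label = region.label.lower()
        # Whole-word match so that e.g. "cat" does not match inside "catch".
        if re.search(r"\b" + re.escape(label) + r"\b", text):
            grounded.setdefault(label, []).append(region.box)
    return grounded


# Made-up example data, for illustration only.
regions = [
    Region("dog", (10, 20, 110, 200)),
    Region("frisbee", (120, 40, 160, 80)),
    Region("cat", (0, 0, 5, 5)),
]
answer = "A dog is jumping to catch a frisbee."
result = ground_entities(answer, regions)
print(result)
```

In this toy version the text-only answer is left untouched and grounding is a separate post-processing step, which matches the "pipeline after the LLM output" reading above; a real system would presumably use a proper entity extractor and matcher rather than plain word matching.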