Question on the Role of Text Encoder in Bunny MLLM Architecture

BAAI-DCAI / Bunny

A family of lightweight multimodal models.

Apache License 2.0

874 stars, 66 forks

Question on the Role of Text Encoder in Bunny MLLM Architecture #108

Closed: codefanw closed this issue 1 month ago

codefanw commented 1 month ago

Thanks for your impressive work! The idea of integrating a text encoder before the large language model caught my attention. Could you share the specific benefits this approach provides? Also, were any ablation studies performed to quantify the impact of this component on overall performance?

Isaachhh commented 1 month ago

The text encoder in our architecture figure represents the tokenizer of the LLM.
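
To illustrate the point for readers: a tokenizer only maps text to integer token IDs, and the LLM's own embedding layer then turns those IDs into vectors, so no separate learned text encoder is involved. The sketch below is a toy under stated assumptions, not Bunny's code: `ToyTokenizer`, `make_embedding_table`, and `embed` are hypothetical names, the whitespace vocabulary stands in for the LLM's real subword tokenizer, and the random table stands in for its trained embedding matrix.

```python
# Toy illustration only: hypothetical whitespace tokenizer and random
# embedding table. This is NOT Bunny's code, vocabulary, or weights.
import random


class ToyTokenizer:
    """Maps whitespace-separated words to integer IDs (0 is reserved for unknown words)."""

    def __init__(self, corpus):
        self.vocab = {"<unk>": 0}
        for word in sorted(set(corpus.lower().split())):
            self.vocab[word] = len(self.vocab)

    def encode(self, text):
        # Pure lookup: no learned parameters are involved at this stage.
        return [self.vocab.get(word, 0) for word in text.lower().split()]


def make_embedding_table(vocab_size, dim, seed=0):
    """Stand-in for the LLM's embedding matrix: one vector per token ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(token_ids, table):
    """Row lookup performed inside the LLM, after tokenization."""
    return [table[token_id] for token_id in token_ids]


tokenizer = ToyTokenizer("describe the image in detail")
table = make_embedding_table(len(tokenizer.vocab), dim=4)

ids = tokenizer.encode("describe the picture")  # "picture" is out of vocabulary
vectors = embed(ids, table)                     # one 4-dimensional vector per token
```

In a multimodal setup like the one discussed here, these text embeddings would sit alongside the projected image features as input to the language model; that last step is left out of the sketch.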

codefanw commented 1 month ago

Alright, thank you for your answer!