THUDM / CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B
Apache License 2.0

cannot understand using Meta-Llama-3-8B-Instruct as base model #161

Open dreaming12580 opened 1 month ago

dreaming12580 commented 1 month ago

I don't understand how Meta-Llama-3-8B-Instruct can be used as the base model.

The CogVLM2 model uses a VisionExpertAttention module for self_attn, where language_expert_query_key_value is a single Linear inside VisionExpertAttention.

The Meta-Llama-3-8B-Instruct model uses a LlamaAttention module for self_attn, where q_proj, k_proj, v_proj, and o_proj are four separate Linears. So how can Meta-Llama-3-8B-Instruct be used as the base model?

dreaming12580 commented 1 month ago

Got it: it's a matrix combination (concatenation) operation.
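To make the combination concrete, here is a minimal numpy sketch (with toy dimensions, not the real Llama-3-8B sizes) of the idea: the separate q_proj / k_proj / v_proj weight matrices can be concatenated along the output dimension into one fused weight, so a single Linear like language_expert_query_key_value produces the same q, k, v as three separate projections. This is an illustration of the general technique, not the actual CogVLM2 conversion code.

```python
import numpy as np

# Toy hidden size for illustration only (Llama-3-8B uses 4096 with
# grouped-query attention, so real shapes differ).
hidden = 8
rng = np.random.default_rng(0)

# Separate projection weights, as in LlamaAttention's q_proj / k_proj /
# v_proj (an nn.Linear stores weight with shape [out_features, in_features]).
W_q = rng.standard_normal((hidden, hidden))
W_k = rng.standard_normal((hidden, hidden))
W_v = rng.standard_normal((hidden, hidden))

x = rng.standard_normal((2, hidden))  # a batch of 2 token embeddings

# Three separate projections, Llama style: y = x @ W.T
q, k, v = x @ W_q.T, x @ W_k.T, x @ W_v.T

# Fused projection: concatenate the three weights along the output
# dimension, apply once, then split the result back into q, k, v.
W_qkv = np.concatenate([W_q, W_k, W_v], axis=0)  # [3 * hidden, hidden]
qkv = x @ W_qkv.T
q2, k2, v2 = np.split(qkv, 3, axis=-1)

# The fused and separate formulations are numerically identical.
assert np.allclose(q, q2) and np.allclose(k, k2) and np.allclose(v, v2)
print("fused qkv matches separate q/k/v projections")
```

So converting the Llama checkpoint just requires stacking the three weight matrices into one before loading them into the fused layer (and the reverse split recovers them).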