InternLM / InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

About the CLIP(freeze/unfreeze? [CLS] token available in further dev?) in InternLM-XComposer2 #336

Open Coobiw opened 1 week ago

Coobiw commented 1 week ago

Hi, thanks for your great work! I would like to know whether the CLIP vision encoder is unfrozen during all training stages. I want to use the CLIP [CLS] token for further development. Is that reasonable in InternLM-XComposer2? Thanks for your reply!