XuRui314 closed this issue 6 months ago
We believe Alpha-CLIP is already well aligned with the CLIP space. The setting for the experiment in Section 4.2 is as follows:
At the code level, since we adopt the LLaVA training code, this second stage only involves:
All the other code is the same as in the original LLaVA.
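To make the "only the projector is trained" idea concrete, here is a minimal sketch (not the authors' actual code) of how one might select the trainable parameters in a LLaVA-style model for this alignment stage: the Alpha-CLIP vision tower and the LLM stay frozen, and only the multimodal MLP projector is updated. The parameter-name substring `mm_projector` follows LLaVA's naming convention, but the exact names here are assumptions for illustration:

```python
# Hypothetical sketch: choose which LLaVA-style parameters stay trainable
# when aligning Alpha-CLIP features to the original CLIP/projector space.
# All parameter names below are illustrative assumptions, not real code.

def trainable_param_names(all_names, train_projector_only=True):
    """Return the subset of parameter names left unfrozen."""
    if not train_projector_only:
        # Full fine-tuning: everything is trainable.
        return list(all_names)
    # Alignment stage: only the multimodal projector (MLP) is updated;
    # the Alpha-CLIP vision tower and the LLM remain frozen.
    return [n for n in all_names if "mm_projector" in n]

# Example parameter names, loosely mimicking a LLaVA checkpoint layout.
names = [
    "model.vision_tower.blocks.0.attn.qkv.weight",  # Alpha-CLIP encoder (frozen)
    "model.mm_projector.0.weight",                  # MLP projector (trained)
    "model.mm_projector.2.bias",                    # MLP projector (trained)
    "model.layers.0.self_attn.q_proj.weight",       # LLM (frozen)
]

print(trainable_param_names(names))
```

In an actual PyTorch training script this selection would translate to setting `requires_grad = False` on every parameter whose name is not returned by the filter.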
Thanks for sharing
Fine-tuning the Alpha-CLIP LLaVA-1.5 model is mentioned in Section 4.2 of the paper, "Alpha-CLIP in MLLM". I would like to know the detailed settings. Do you train the model following stage 1 and stage 2 of the GPT4RoI paper?
Also, if I want to train a general-purpose Alpha-CLIP MLLM, could you provide some guidance on fine-tuning? Should we completely follow the LLaVA-1.5 settings and train the MLP and LLM from scratch, or should we just fine-tune the MLP layer on part of the data to align the original CLIP space with the Alpha-CLIP space?