Reproduction of InternLM-XComposer2

TideDra / VL-RLHF

A RLHF Infrastructure for Vision-Language Models

Apache License 2.0

Hi,

Thank you for sharing this great work. I'm trying to reproduce the performance of InternLM-XComposer2 + DPO + VLFeedback, but I found that the baseline performance you reported for InternLM-Xcomposer2-VL-7b differs slightly from the numbers in the original paper. Could you explain why? Also, is the dpo_internlmxc2vl7b.sh file in the ./scripts folder the command for reproducing your InternLM-Xcomposer2-VL-7b-DPO model? If not, could you share the script or config file needed to reproduce it? Thank you again for the nice work.

Reproduction of InternLM-XComposer2 #9