InternLM / InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
Apache License 2.0

Will the DPO training code be open-sourced? #388

Open chesiy opened 1 month ago

chesiy commented 1 month ago

As the title says. The README states that "IXC-2.5 leverages specially designed Chain-of-Thought (CoT) and Direct Preference Optimization (DPO) techniques to significantly enhance the quality of its written content." Will the DPO code for this part be open-sourced?

yuhangzang commented 1 month ago

Thanks for your interest in our work.

Open-sourcing the DPO code requires refactoring it first; for example, the forward function of the DPO model is inconsistent with that of the SFT model.

We plan to release the official DPO code in our next version. Let's leave this issue open. When the DPO code is released I will reply and close this issue.
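
For readers wondering why the forward functions might differ, here is a minimal sketch assuming the standard DPO objective; it is not the IXC-2.5 implementation, which has not been released. SFT needs only the token log-probabilities of one target sequence, whereas DPO needs summed log-probabilities of a chosen and a rejected response under both the policy and a frozen reference model. The function names `sft_loss` and `dpo_loss` and the default `beta=0.1` are hypothetical choices for illustration.

```python
import math


def sft_loss(token_logps):
    """SFT: mean negative log-likelihood over one target sequence."""
    return -sum(token_logps) / len(token_logps)


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    """
    # Scaled difference of the policy-vs-reference log-ratios.
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)) computed as a numerically stable softplus(-margin).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # One sequence in, one loss out.
    print(sft_loss([-1.0, -3.0]))
    # Four sequence-level log-probs in (two responses x two models), one loss out.
    print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

Under this assumption, a DPO training step consumes paired inputs and two models per example, so a forward function written for single-sequence SFT batches would not fit without changes.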