chesiy opened this issue 1 month ago (status: Open)
Thanks for your interest in our work.
Open-sourcing the DPO code requires refactoring it first; for example, the forward function of the DPO model is currently inconsistent with that of the SFT model.
We plan to release the official DPO code in our next version. Let's leave this issue open; when the DPO code is released, I will reply here and close it.
As the title says. The README mentions that "IXC-2.5 leverages specially designed Chain-of-Thought (CoT) and Direct Preference Optimization (DPO) techniques to significantly enhance the quality of its written content." Will the DPO code for this part be open-sourced?