Closed yezhengmao1 closed 5 months ago
@waitfor-night can you check this code for the DPO algorithm?
@waitfor-night I added a feature that loads a trained LoRA adapter and freezes it during DPO training. Can you check and test it?