Closed hahuyhoang411 closed 6 months ago
Hi @hahuyhoang411! This can be done by passing loss_type to the DPOTrainer, and will be fixed in the linked PR above.
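For anyone wondering what switching loss_type actually changes, here is a minimal pure-Python sketch, not TRL's own code. It assumes the usual formulation, where `logits` is the policy's chosen-minus-rejected log-ratio minus the reference model's, and it covers only the "sigmoid" (standard DPO), "hinge", and "ipo" variants. The names `preference_loss` and `softplus` are hypothetical helpers made up for illustration. Where the loss_type argument lives (on the trainer itself or on its config object) seems to depend on the TRL version, so check the docs for the release you have installed.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def preference_loss(logits: float, beta: float = 0.1, loss_type: str = "sigmoid") -> float:
    """Per-example preference loss for a given loss_type (illustrative helper).

    logits: (policy chosen - rejected log-prob) minus
            (reference chosen - rejected log-prob).
    beta:   strength of the implicit KL regularisation.
    """
    if loss_type == "sigmoid":
        # Standard DPO: -log(sigmoid(beta * logits))
        return softplus(-beta * logits)
    if loss_type == "hinge":
        # Hinge variant: max(0, 1 - beta * logits)
        return max(0.0, 1.0 - beta * logits)
    if loss_type == "ipo":
        # IPO variant: squared distance of logits from 1 / (2 * beta)
        return (logits - 1.0 / (2.0 * beta)) ** 2
    raise ValueError(f"Unknown loss_type: {loss_type!r}")


if __name__ == "__main__":
    # Compare the variants on the same example.
    for name in ("sigmoid", "hinge", "ipo"):
        print(name, preference_loss(logits=2.0, beta=0.1, loss_type=name))
```

The point of the sketch is only that loss_type selects a different function of the same `logits`; the rest of the training loop stays the same.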
Great news. Thanks.
Anyway, do you also want to implement Unsloth? I saw that TRL supports Unsloth, which could reduce memory usage.
I want to change the loss_type to KTO or something else to test, but I can't. Please show me the way. Thank you.