huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars, 357 forks

How can I config `loss_type`? #87

hahuyhoang411 closed 6 months ago

hahuyhoang411 commented 6 months ago

I want to change `loss_type` to KTO or another loss function for testing, but I can't find a way to configure it. Could you show me how? Thank you.

lewtun commented 6 months ago

Hi @hahuyhoang411! This can be done by passing `loss_type` to the `DPOTrainer`, and it will be fixed in the linked PR above.
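For illustration, here is a minimal sketch of what passing `loss_type` through to the trainer could look like. It assumes a TRL version from around the time of this issue, where `DPOTrainer` accepted `beta` and `loss_type` directly as keyword arguments and `"kto_pair"` was one of the accepted values. Later TRL releases changed where these options are set, so check the documentation for the version you have installed. `DPOSettings`, `build_trainer_kwargs`, and `make_trainer` are hypothetical helper names used only for this example; they are not part of the handbook or of TRL.

```python
from dataclasses import dataclass

# Loss names assumed to be accepted by TRL's DPOTrainer around the time of
# this issue. This list is an assumption; verify it against your TRL version.
ASSUMED_LOSS_TYPES = ("sigmoid", "hinge", "ipo", "kto_pair")


@dataclass
class DPOSettings:
    """Hypothetical container for the DPO options we want to expose in a config."""

    beta: float = 0.1
    loss_type: str = "sigmoid"

    def __post_init__(self):
        # Fail early on typos instead of deep inside the training loop.
        if self.loss_type not in ASSUMED_LOSS_TYPES:
            raise ValueError(
                f"Unknown loss_type {self.loss_type!r}; expected one of {ASSUMED_LOSS_TYPES}"
            )


def build_trainer_kwargs(settings: DPOSettings) -> dict:
    """Turn the settings into keyword arguments for the trainer."""
    return {"beta": settings.beta, "loss_type": settings.loss_type}


def make_trainer(model, ref_model, training_args, train_dataset, tokenizer, settings):
    """Build a DPOTrainer with the chosen loss (sketch, assumes an older TRL API)."""
    # Imported lazily so the helpers above can be used without TRL installed.
    from trl import DPOTrainer

    return DPOTrainer(
        model,
        ref_model,
        args=training_args,
        train_dataset=train_dataset,
        tokenizer=tokenizer,  # newer TRL versions may expect a different argument name
        **build_trainer_kwargs(settings),
    )
```

With this in place, switching losses is a one-line change, for example `DPOSettings(loss_type="kto_pair")`, and an unsupported name raises a `ValueError` before any model is loaded.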

hahuyhoang411 commented 6 months ago

Great news. Thanks.

Anyway, do you also plan to support Unsloth? I saw that TRL supports Unsloth, which could reduce memory usage.