xfactlab / orpo

Official repository for ORPO
Apache License 2.0
421 stars, 39 forks

[Question] ORPO + SFTTrainer + QLora #10

Closed: snassimr closed this issue 7 months ago

snassimr commented 7 months ago

Hi,

Is it possible to train ORPO with SFTTrainer and QLoRA?

Thanks

nlee-208 commented 7 months ago

Hi @snassimr,

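If you mean applying the SFTTrainer directly to ORPO, that wouldn't be feasible: alongside the supervised fine-tuning objective, ORPO simultaneously trains on preference pairs (a chosen and a rejected response per prompt) through an odds-ratio term, and the SFTTrainer has no such term.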
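To make the point concrete, here is a minimal sketch of how the objective combines the two parts for a single preference pair. It is an illustration, not the repository's training code: the function names, the use of length-averaged log-probabilities as inputs, and the default weight `lam=0.1` are assumptions made for this example.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Sketch of an ORPO-style loss for one (chosen, rejected) pair.

    Inputs are length-averaged token log-probabilities of each response
    under the model. `lam` is a hypothetical default weight for the
    odds-ratio term.
    """
    # Supervised part: negative log-likelihood of the chosen response only.
    nll = -chosen_avg_logp
    # Preference part: log odds ratio between chosen and rejected responses.
    log_or = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    or_term = math.log1p(math.exp(-log_or))
    return nll + lam * or_term
```

The rejected response enters only through the odds-ratio term, so a trainer that consumes plain prompt and completion data has nothing to compute it from.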

As for QLoRA, we plan to integrate PEFT methods in the near future, but note that QLoRA may not work well with ZeRO Stage 3 or FSDP (related reddit). If you need an implementation right away, feel free to try the TRL ORPO implementation.