Open maziyarpanahi opened 3 months ago
I think we'll want to change from our orpo implementation to the trl ORPOTrainer implementation.
This is interesting! Would love to help test it if you have any work in progress for ORPOTrainer.
Hi @winglian, any updates? Tell me if you need me to test anything.
oh, hmm, we already use the ORPOTrainer (https://github.com/axolotl-ai-cloud/axolotl/pull/1551/files), will need to dig into this a bit deeper
Thanks a lot @winglian - is this something new? Should I try again? (my message is a few months old)
Please check that this issue hasn't been reported before.
Expected Behavior
I expect ORPO to work properly with FSDP and DeepSpeed on Qwen2 models.
Current behaviour
Currently, it's not possible to use ORPO via FSDP or DeepSpeed. It results in
Possible issues:
Steps to reproduce
winglian/axolotl-runpod:main-latest template
608a2f3 commit (to avoid FSDP issue with the latest changes)
Config yaml
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.10
axolotl branch-commit
608a2f3
Acknowledgements