Open lewtun opened 2 months ago
Duplicate #1811, closing it in favour of this one
Hi, it looks like it'd just be an extension of the SUPPORTED_COMMANDS constant?
For instance, orpo would already be there: https://github.com/huggingface/trl/blob/main/examples/scripts/orpo.py
I didn't check the others, but I compared the scripts of kto with orpo, for instance.
@grumpyp yes I think this is all one needs - a better solution would be to have an automated way to populate this constant by globbing all the _trainer.py files (e.g. make cli_commands). This way, anytime someone adds a new trainer, we automatically get support for it in the CLI :)
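A minimal sketch of the globbing idea, assuming the trainer modules sit in one directory and are named `<command>_trainer.py`. The function name `discover_commands` and the temporary-directory demo are hypothetical and are not TRL's actual code:

```python
import tempfile
from pathlib import Path


def discover_commands(trainer_dir):
    """Return sorted command names derived from *_trainer.py files (hypothetical helper)."""
    suffix = "_trainer.py"
    return sorted(
        path.name[: -len(suffix)]  # strip the suffix: "orpo_trainer.py" -> "orpo"
        for path in Path(trainer_dir).glob("*" + suffix)
    )


# Demo on a throwaway directory that mimics an assumed trainer layout.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sft_trainer.py", "dpo_trainer.py", "kto_trainer.py",
                 "orpo_trainer.py", "utils.py"):
        (Path(tmp) / name).touch()
    SUPPORTED_COMMANDS = discover_commands(tmp)

print(SUPPORTED_COMMANDS)  # ['dpo', 'kto', 'orpo', 'sft']
```

With something like this, adding a new `*_trainer.py` file would extend the command list without editing the constant by hand. A real version would probably also need to skip shared or base modules that happen to match the pattern.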
yes definitely! If you want, assign the issue to me please. I'll try to get a PR out today.
assigned! thanks for the offer to help 🤗
hi @lewtun
I didn't want to manipulate .py files via the Makefile, so I took a slightly different approach.
It now creates the commands dynamically using a utility function, which was originally cached. EDIT: I removed the caching, since the function isn't called anywhere else and the process terminates after running, so the cache is never reused.
Let me know if that works for you or if it needs changes. Thanks for the opportunity to contribute.
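To illustrate the caching point, here is a small sketch assuming the cache was an in-memory one such as `functools.lru_cache` (the thread does not say which mechanism was used). The function name `get_supported_commands` and the counter are hypothetical:

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def get_supported_commands():
    """Hypothetical stand-in for the command-discovery utility."""
    global call_count
    call_count += 1
    return ("sft", "dpo", "kto")


get_supported_commands()  # first call: body runs, result is stored in memory
get_supported_commands()  # second call in the same process: served from the cache

print(call_count)                               # 1
print(get_supported_commands.cache_info().hits)  # 1
```

An in-memory cache like this only helps on repeated calls within one process. A CLI invocation that calls the function once and then exits gets no benefit, and nothing carries over to the next invocation.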
Feature request
The CLI currently supports training models with SFT/DPO/KTO: https://github.com/huggingface/trl/blob/6859e048da601fec181997a324e7b351fc997a33/trl/commands/cli.py#L24
It would be good to extend this support to all trainers so that we have a consistent API and also learn which parts of our scripts need refactoring to support this usage.
This could be tackled in separate PRs to keep things lightweight, and I'll track here the trainers in terms of priority to add (based on Hub usage):
Motivation
It is somewhat annoying that one cannot train a model through the CLI as this is helpful for fast debugging / iterations.
Your contribution
Happy to open PRs, but this could be a good first issue for new contributors!