Closed renmengjie7 closed 2 months ago
Hi, we will soon add a detailed readme on how to run the scripts.
To train a model using DPO:
export PYTHONPATH=$PYTHONPATH:$(pwd)
python src/dpo/run_dpo.py src/dpo/recipes/config_qlora_dpo_codellama_base.yaml
You can check all the arguments in the config_qlora_dpo_codellama_base.yaml file. If you have your own model checkpoint locally, you can run:
python src/dpo/run_dpo.py src/dpo/recipes/config_qlora_dpo_codellama.yaml \
--model_name_or_path={path_to_your_model_checkpoint}
Make sure to change the chat_template argument accordingly, depending on your model type/family :)
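For orientation, a recipe file in this style usually pins the fields mentioned above. The excerpt below is a hypothetical sketch — the key names are assumptions based on the command-line overrides shown here, not confirmed contents of config_qlora_dpo_codellama_base.yaml:

```yaml
# Hypothetical recipe excerpt — check the actual YAML file for the real keys.
model_name_or_path: codellama/CodeLlama-7b-hf  # overridable via --model_name_or_path
chat_template: "..."  # set a Jinja chat template matching your model family
```

Any key in the recipe can typically also be overridden on the command line, as with --model_name_or_path above.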
SFT training works similarly with the recipe file config_qlora_sft_codellama.yaml.
Thank you very much for your quick reply!
How do we use the code in src/dpo to train a model? Can you provide an example script?