huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
Code release #11 (Closed)
lewtun closed 11 months ago
lewtun commented 11 months ago
TODO
- [x] Add instructions on how to train Zephyr
- [x] Add LoRA configs and QLoRA configs
- [x] Evaluate Zephyr SFT & DPO models with MT-Bench
- [x] Add unit tests or YOLO?
Closes #6