huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0

DPO recipe saves a float32 model #121

Open tcapelle opened 4 months ago

tcapelle commented 4 months ago

Hello,

I have been using the Zephyr DPO recipe, and the models I get are saved in float32. I am using config_full.yaml with the accelerate multi_gpu.yaml config.

I think the issue is that config_full does not set the model's dtype to bfloat16, so it falls back to the float32 default.

Should this be changed?
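If the fix is to set the dtype in the recipe config, a minimal sketch could look like the fragment below (the `torch_dtype` field name is assumed from the handbook's model arguments; the exact keys in config_full.yaml may differ):

```yaml
# Sketch of a config_full.yaml fragment — torch_dtype assumed to be supported
model_name_or_path: alignment-handbook/zephyr-7b-sft-full
torch_dtype: bfloat16   # load (and therefore save) weights in bfloat16 instead of float32
```

With `torch_dtype` left unset, `from_pretrained` typically loads weights in float32, and the trainer then saves them in that dtype.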