huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0

DPO recipe saves a float32 model #121

Open tcapelle opened 4 months ago

tcapelle commented 4 months ago

Hello,

I have been using the Zephyr DPO recipe, and the models I get are saved in float32. I am using config_full.yaml with the accelerate multi_gpu.yaml config.

I think the issue is that config_full does not set the model's dtype to bfloat16, so it falls back to the float32 default.

Should this be changed?
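If the fix is to set the dtype in the recipe config, a minimal sketch could look like the fragment below (the `torch_dtype` field name is assumed from the handbook's model arguments; the exact keys in config_full.yaml may differ):

```yaml
# Sketch of a config_full.yaml fragment — torch_dtype assumed to be supported
model_name_or_path: alignment-handbook/zephyr-7b-sft-full
torch_dtype: bfloat16   # load (and therefore save) weights in bfloat16 instead of float32
```

With `torch_dtype` left unset, `from_pretrained` typically loads weights in float32, and the trainer then saves them in that dtype.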