huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars 357 forks source link

Add `auto_insert_empty_system_msg` config flag #123

Closed BramVanroy closed 4 months ago

BramVanroy commented 4 months ago

Currently there is functionality to automatically insert an empty system message if system occur in the Jinja template.

https://github.com/huggingface/alignment-handbook/blob/87cc800498b17432cfb7f5acb5e9a79f15c867fc/src/alignment/data.py#L27-L38

This is not foolproof method: GEMMA (SFT) for instance has an explicit part in its chat template that says that if the role is "system" that an error must be raised. So this leads to a conflict with the alignment handbook. Therefore I suggest giving all control to the user but doing so in a backwards-compatible manner: by adding a flag that, if True, adds the system message with the same behavior as before, and when False no system message will be added. To this end, auto_insert_empty_system_msg is added as a DataArguments argument (defaulting to True).

Note that there is one small breaking change: the maybe_insert_system_message is currently implemented only for sft, generation and rm, but it was not used for dpo. There, a system message was ALWAYS added if the first message was not system, even if the Jinja template did not mention system. I changed that to be in line with the other tasks so the default behavior will be a litttle bit different when a tokenizer's chat template does not contain 'system' in DPO.

HuggingFaceDocBuilderDev commented 4 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

WeitaoLu commented 4 months ago

A small bug solution, if you encounter AttributeError: 'DataArguments' object has no attribute 'auto_insert_empty_system_msg' when running dpo , you can commit line 99 in run_dpo.py to disable this flag

"auto_insert_empty_system_msg": data_args.auto_insert_empty_system_msg,

BramVanroy commented 4 months ago

A small bug solution, if you encounter AttributeError: 'DataArguments' object has no attribute 'auto_insert_empty_system_msg' when running dpo , you can commit line 99 in run_dpo.py to disable this flag #"auto_insert_empty_system_msg": data_args.auto_insert_empty_system_msg,

If you get the attribute error you'll likely just need to update the repo to the newer version.