huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.2k stars 357 forks source link

Add check before inserting system message #106

Closed nathan-az closed 5 months ago

nathan-az commented 5 months ago

This should solve https://github.com/huggingface/alignment-handbook/issues/101. Currently an empty system message is always inserted if there is not one. Some tokenizers do not accept a system message.

Not sure this message is foolproof, but this checks the jinja template string for a mention of system before inserting. Includes unit test.

May also solve https://github.com/huggingface/alignment-handbook/issues/93 according to @Feynman27 but I have not confirmed.

HuggingFaceDocBuilderDev commented 5 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

dctanner commented 5 months ago

My SFT dataset includes a system message, and my chat_template adds this system message if it's not in the messages. However now we are inserting a blank system message. Should we not just let the chat_template insert the system message and set it to the correct message?

nathan-az commented 5 months ago

Interesting, I've never seen a chat template that added a system prompt itself, although I guess it would be simple enough.

Have you encountered any actual issues with the current code? As far as I can tell, this PR is just making it so the system message is only inserted if the chat template expects it, so it's just doing it ahead of time.