huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0

Check that `default_chat_template` is also None #83

Closed nathan-az closed 6 months ago

nathan-az commented 6 months ago

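The chat template is currently overwritten whenever the tokenizer's `chat_template` attribute is None. Some tokenizers define their template through `default_chat_template` instead and never set `chat_template`, so their built-in template gets replaced. I noticed this in my use case with CodeLlama. This PR only overwrites the template when `default_chat_template` is also None.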

Bug report: https://github.com/huggingface/alignment-handbook/issues/84
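The check described above can be sketched roughly as follows. This is a minimal illustration and not the handbook's actual code: `maybe_set_chat_template`, `FALLBACK_CHAT_TEMPLATE`, and the `SimpleNamespace` stand-ins for tokenizers are all hypothetical names introduced here. `getattr` is used because `default_chat_template` may not exist on every tokenizer class or `transformers` version.

```python
from types import SimpleNamespace

# Hypothetical fallback template, used only for illustration.
FALLBACK_CHAT_TEMPLATE = (
    "{% for message in messages %}{{ message['content'] }}{% endfor %}"
)


def maybe_set_chat_template(tokenizer, fallback=FALLBACK_CHAT_TEMPLATE):
    """Apply `fallback` only when the tokenizer defines no template at all."""
    has_explicit = getattr(tokenizer, "chat_template", None) is not None
    has_default = getattr(tokenizer, "default_chat_template", None) is not None
    if not has_explicit and not has_default:
        tokenizer.chat_template = fallback
    return tokenizer


# Stand-ins for real tokenizers (assumption: only these two attributes matter).
codellama_like = SimpleNamespace(
    chat_template=None, default_chat_template="<llama-style template>"
)
bare = SimpleNamespace(chat_template=None, default_chat_template=None)

maybe_set_chat_template(codellama_like)  # left untouched: a default template exists
maybe_set_chat_template(bare)            # receives the fallback template
```

Under these assumptions, a regression check for the CodeLlama case would assert that `chat_template` stays None on the CodeLlama-like object while the bare object receives the fallback.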

nathan-az commented 6 months ago

> Great catch, cheers! If you're open to writing a unit test against your codellama example, that would be great to ensure we don't accidentally introduce a regression in future :)

No worries @lewtun! Unit test added.

The test includes a docstring for clarity, but none of the others do - if you prefer the code to be "self-documented" let me know and I can remove it.

Let me know if any other changes are needed before you can merge :)