chiwanpark closed this PR 2 months ago
Hey, thanks for the PR. Could you also add a check for whether the new chat template string is actually different? We don't want to warn if it's the same.
Could you also share some test logs showing when the warning does and does not appear?
@NanoCode012 Sorry for the late response. I've updated the patch to warn only if the overriding template differs from the one in the tokenizer. Here is an example log when I apply the chatml template to gemma-2-2b-it:
```
model:
dP dP dP
88 88 88
.d8888b. dP. .dP .d8888b. 88 .d8888b. d8888P 88
88' `88 `8bd8' 88' `88 88 88' `88 88 88
88. .88 .d88b. 88. .88 88 88. .88 88 88
`88888P8 dP' `dP `88888P' dP `88888P' dP dP
****************************************
**** Axolotl Dependency Versions *****
accelerate: 0.32.0
peft: 0.12.0
transformers: 4.43.4
trl: 0.9.6
torch: 2.4.0+cu124
bitsandbytes: 0.43.3
****************************************
[2024-08-24 15:28:54,225] [INFO] [axolotl.utils.config.models.input.check_eval_packing:964] [PID:21491] [RANK:0] explicitly setting `eval_sample_packing` to match `sample_packing`
[2024-08-24 15:28:54,227] [DEBUG] [axolotl.normalize_config:83] [PID:21491] [RANK:0] bf16 support not detected, disabling for this configuration.
[2024-08-24 15:28:54,236] [INFO] [axolotl.normalize_config:183] [PID:21491] [RANK:0] GPU memory usage baseline: 0.000GB ()
[2024-08-24 15:28:56,078] [DEBUG] [axolotl.load_tokenizer:282] [PID:21491] [RANK:0] EOS: 107 / <end_of_turn>
[2024-08-24 15:28:56,078] [DEBUG] [axolotl.load_tokenizer:283] [PID:21491] [RANK:0] BOS: 2 / <bos>
[2024-08-24 15:28:56,078] [DEBUG] [axolotl.load_tokenizer:284] [PID:21491] [RANK:0] PAD: 0 / <pad>
[2024-08-24 15:28:56,078] [DEBUG] [axolotl.load_tokenizer:285] [PID:21491] [RANK:0] UNK: 3 / <unk>
[2024-08-24 15:28:56,079] [WARNING] [axolotl.load_tokenizer:295] [PID:21491] [RANK:0] The chat template is being overridden by the one in the config. This may cause a mismatch with the existing chat template in the base model. Please verify that this override is intended and compatible.
```
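The check the patch describes can be sketched roughly as follows. This is a minimal illustration, not the actual Axolotl code; the function and parameter names here are hypothetical:

```python
import logging

LOG = logging.getLogger(__name__)


def maybe_override_chat_template(tokenizer, cfg_chat_template):
    """Apply the chat template from the training config, warning only
    when it actually differs from the tokenizer's existing template.

    Hypothetical helper for illustration; `cfg_chat_template` stands in
    for the template string resolved from the training config.
    """
    if cfg_chat_template is None:
        return  # nothing to override

    existing = getattr(tokenizer, "chat_template", None)
    if existing is not None and existing != cfg_chat_template:
        # Warn only on a genuine mismatch, not when the strings are equal.
        LOG.warning(
            "The chat template is being overridden by the one in the config. "
            "This may cause a mismatch with the existing chat template in the "
            "base model. Please verify that this override is intended and "
            "compatible."
        )
    tokenizer.chat_template = cfg_chat_template
```

Comparing the raw template strings keeps the check cheap and avoids warning when the config simply restates the tokenizer's own template.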
Hello, we're currently trying to get PR #1732 merged, which should make this more explicit. For example, setting `chat_template: tokenizer_default` would use the tokenizer's own template. This way, if a user explicitly sets a different template such as `chatml`, we assume they understand what they're doing.
Would this be reasonable? If so, we can close this PR.
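As a sketch, the config option described above would look something like this (a hypothetical fragment, assuming the syntax proposed for PR #1732):

```yaml
# Reuse the chat template already stored in the tokenizer config
chat_template: tokenizer_default

# ...or explicitly opt in to a different template, signalling that
# the override is intentional:
# chat_template: chatml
```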
@NanoCode012 Yep, the approach with `chat_template: tokenizer_default` would be better than this warning. I'll close this PR. :)
Description
This PR adds a warning message when the chat template in the tokenizer config is overridden by the one in the training config.
Motivation and Context
This PR resolves #1120.