Closed: kevinhu closed this issue 12 months ago
This doesn't involve the model's training parameters; is it a setting tied to the training set, used to obtain the max token count?
It is also present in the Llama 2 config: https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/6fdf2e60f86ff2481f2241aaee459f85b5b0bbb9/tokenizer_config.json#L22. I guess it is there for compatibility with other HuggingFace tokenizers. You can safely ignore it.
See https://huggingface.co/PY007/TinyLlama-1.1B-step-50K-105b/blob/main/tokenizer_config.json#L22
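A quick way to see why the field can be ignored is to inspect it directly. This is a minimal sketch assuming the field in question is `model_max_length`; the sample value below is the sentinel HuggingFace writes when no real limit was set, used here for illustration rather than copied from the linked file.

```python
import json

# Hypothetical tokenizer_config.json snippet (assumed field and value,
# not copied verbatim from the TinyLlama or Llama 2 repos).
sample_config = json.loads("""
{
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "LlamaTokenizer"
}
""")

max_len = sample_config["model_max_length"]

# A value this large is effectively "no limit": it is a placeholder,
# not a property of the training data, so the real context window
# should be read from the model config instead.
print("effectively unlimited:", max_len > 10**12)  # prints: effectively unlimited: True
```

In other words, the field carries no information about the training set; when it holds this sentinel, downstream code treats the tokenizer as having no length cap of its own.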