henryzhongsc / longctx_bench

Official implementation for Yuan & Liu & Zhong et al., KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024
https://arxiv.org/abs/2407.01527
MIT License

Chat template missing error. #1

Open drewjin opened 4 days ago

drewjin commented 4 days ago

While running the code as a demo, I encountered a strange problem caused by the chat template. Code (in pipeline/model_utils.py):

# This is the customized building prompt for chat models
def build_chat(tokenizer: AutoTokenizer, prompt, chat_template):
    if chat_template is None:
        return prompt

    if "llama3" in chat_template.lower():
        messages = [
            {"role": "user", "content": prompt},
        ]
        prompt = tokenizer.apply_chat_template(
            conversation=messages, tokenize=False, add_generation_prompt=True
        )
    elif "longchat" in chat_template or "vicuna" in chat_template:
        from fastchat.model import get_conversation_template
        conv = get_conversation_template("vicuna")
        conv.append_message(conv.roles[0], prompt)
        conv.append_message(conv.roles[1], None)
        prompt = conv.get_prompt()
    elif "mistral" in chat_template.lower():
        messages = [
            {"role": "user", "content": prompt},
        ]
        prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    elif "recurrentgemma" in chat_template:
        messages = [
            {"role": "user", "content": prompt},
        ]
        prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    elif "mamba-chat" in chat_template:
        messages = [
            {"role": "user", "content": prompt},
        ]
        prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    elif "rwkv" in chat_template:
        prompt = f"""User: hi

                Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

                User: {prompt}

                Assistant:"""
    else:
        logger.error(f"{chat_template} is unsupported.")
        raise NotImplementedError

    return prompt

Traceback:

Traceback (most recent call last):
  File "/home/jovyan/LLM_Workplace/longctx_bench/pipeline/baseline/main.py", line 65, in <module>
    processed_results, raw_results = eval_longbench(config)
                                     ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/LLM_Workplace/longctx_bench/pipeline/baseline/eval_longbench.py", line 58, in eval_longbench
    preds= get_pred(model, tokenizer, data, device, pipeline_params, eval_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/LLM_Workplace/longctx_bench/pipeline/baseline/eval_longbench.py", line 31, in get_pred
    prompt = build_chat(tokenizer, prompt, pipeline_params['chat_template'])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/LLM_Workplace/longctx_bench/pipeline/model_utils.py", line 20, in build_chat
    prompt = tokenizer.apply_chat_template(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/.conda/envs/llm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1801, in apply_chat_template
    chat_template = self.get_chat_template(chat_template, tools)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jovyan/.conda/envs/llm/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1962, in get_chat_template
    raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
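
For reference, the failure seems reproducible outside the pipeline; a minimal sketch, assuming the base meta-llama/Llama-3.1-8B checkpoint (whose tokenizer, to my understanding, ships no stored chat template):

from transformers import AutoTokenizer

# Minimal reproduction sketch: base checkpoints typically do not store a chat template.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
print(tok.chat_template)  # expected: None for the base model

# This call should then raise the same ValueError as in the traceback above.
tok.apply_chat_template([{"role": "user", "content": "hi"}], tokenize=False, add_generation_prompt=True)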

Environment:

transformers                      4.45.2

Model I chose as the demo baseline:

Llama-3.1-8B
drewjin commented 4 days ago

Use Llama-3.1-8B-Instruct instead of Llama-3.1-8B

henryzhongsc commented 2 days ago

meta-llama/Llama-3.1-8B is a base model, so its tokenizer does not come with a stored chat template (the template is mostly added during instruction tuning), which is why tokenizer.apply_chat_template() fails here. Should you want to use this model, please set chat_template: null in the pipeline config; the input will then be returned unchanged at L13 of pipeline/model_utils.py (the early return at the top of build_chat), avoiding the error you posted.
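
To illustrate both routes, a minimal sketch assuming build_chat as quoted above (imported from pipeline/model_utils.py) and the checkpoint names discussed in this thread; the "llama3" string only needs to match the corresponding branch of build_chat:

from transformers import AutoTokenizer
from pipeline.model_utils import build_chat  # assuming it is importable from the repo root

# Option 1: keep the base model and set chat_template: null in the pipeline config.
# On this path build_chat never touches the tokenizer; it returns the prompt unchanged.
tok_base = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
raw_prompt = "some long-context prompt"
assert build_chat(tok_base, raw_prompt, None) == raw_prompt

# Option 2: switch to the Instruct checkpoint, whose tokenizer stores a chat template,
# and keep a llama3-style chat_template value in the config.
tok_instruct = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
assert tok_instruct.chat_template is not None
templated_prompt = build_chat(tok_instruct, raw_prompt, "llama3")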

Alternatively, you can just use meta-llama/Llama-3.1-8B-Instruct as you already figured out here.

Note that Llama 3.1 has a much larger context window than Llama 3, so you might also want to adjust model_max_len in the pipeline config accordingly. The typical LongBench convention is to set it to the maximum context (so 128k) minus 500 tokens. Just a reminder in case it is useful, and do let us know if there's anything else we can help with.
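
For what it's worth, a minimal sketch of how such a budget is commonly applied in LongBench-style evaluation (truncating overlong prompts from the middle); the tokenizer and prompt here stand in for whatever the pipeline loads, and the variable names are illustrative rather than the pipeline's own:

from transformers import AutoTokenizer

# Illustrative only: 128k window minus a ~500-token margin, then middle truncation
# in the usual LongBench fashion (keep the head and the tail of the prompt).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
prompt = "..."  # the already-built LongBench prompt
model_max_len = 128 * 1024 - 500

input_ids = tokenizer(prompt, truncation=False).input_ids
if len(input_ids) > model_max_len:
    half = model_max_len // 2
    prompt = (tokenizer.decode(input_ids[:half], skip_special_tokens=True)
              + tokenizer.decode(input_ids[-half:], skip_special_tokens=True))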