npuichigo / openai_trtllm

OpenAI-compatible API for the TensorRT-LLM Triton backend
MIT License

Llama 3 tokenizer no longer works - updated EOS token #44

Open · avianion opened this issue 6 months ago

avianion commented 6 months ago

The official Llama 3 70B Instruct repo has updated the EOS token:

"eos_token": "<|eot_id|>",

Yet when using this library with that EOS token, no output is produced, because the library still uses the old EOS token.

Suggest fixing this, @npuichigo.

npuichigo commented 6 months ago

Which part do you mean? The Triton backend should have a parameter like stop_words.
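
If the server forwards the OpenAI-style `stop` parameter to the backend's stop_words (an assumption; check the request mapping in src/routes), a client can work around the stale default by passing the new token explicitly. A minimal sketch, with the endpoint address and model name as illustrative placeholders:

```rust
// Pass an explicit stop sequence through the OpenAI-compatible endpoint.
// Assumes openai_trtllm maps the OpenAI `stop` field onto Triton's
// stop_words. Requires reqwest (features "blocking", "json") and serde_json.
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = json!({
        "model": "ensemble",                                // illustrative name
        "messages": [{ "role": "user", "content": "Hello" }],
        "stop": ["<|eot_id|>"]                              // Llama 3 end-of-turn token
    });
    let resp = reqwest::blocking::Client::new()
        .post("http://localhost:3000/v1/chat/completions")  // illustrative address
        .json(&body)
        .send()?;
    println!("{}", resp.text()?);
    Ok(())
}
```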

avianion commented 6 months ago

"chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",

poddamatt98 commented 3 months ago

Hi @avianion, I am running into the same issue you describe. Using the Liquid template defined in the repo, the model returns an empty response. I also tried converting your chat template from Jinja to Liquid, but without success. I wonder if you have solved this issue. @npuichigo, can you help us with this?
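
For reference, a direct Liquid translation of the Jinja template above might look like the sketch below. It is untested: `forloop.first` and the `strip` filter stand in for Jinja's `loop.index0` check and `trim`, and `bos_token` / `add_generation_prompt` are assumed to be provided by the template context. If openai_trtllm does not expose those variables, hard-code `<|begin_of_text|>` and the trailing assistant header instead.

```liquid
{%- for message in messages -%}
{%- if forloop.first -%}{{ bos_token }}{%- endif -%}
<|start_header_id|>{{ message.role }}<|end_header_id|>

{{ message.content | strip }}<|eot_id|>
{%- endfor -%}
{%- if add_generation_prompt -%}
<|start_header_id|>assistant<|end_header_id|>

{% endif %}
```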

npuichigo commented 3 months ago

@poddamatt98 I will take a look at this when I have time.

poddamatt98 commented 3 months ago

Problem solved by replacing </s> with <|eot_id|>, both at line 245 in src/routes/chat.rs and at line 226 in src/routes/completions.rs.
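
The exact code at those lines is not quoted in the thread, but the change amounts to swapping the hard-coded Llama 2 style EOS token for Llama 3's end-of-turn token wherever the routes build the stop words sent to Triton. A sketch, with the variable name as an assumption:

```rust
// Before (src/routes/chat.rs line 245 and src/routes/completions.rs line 226):
// the default stop sequence is the Llama 2 style EOS token.
let stop_words = vec!["</s>".to_string()];

// After: use Llama 3's end-of-turn token so generation terminates correctly.
let stop_words = vec!["<|eot_id|>".to_string()];
```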