rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
https://docs.rs/llm/latest/llm/
Apache License 2.0

EOS is not read from gguf format #446

Open Alisa-lisa opened 6 months ago

Alisa-lisa commented 6 months ago

I have discovered that running the same model with the same parameters from llm (gguf branch) and from llama.cpp results in different behavior. llm seems not to read the EOS token, so the model produces output until max tokens is reached. Here is llama.cpp: [screenshot: llamares] And the same model from llm: [screenshot: llm]

According to a discussion on Discord, this might indeed be a bug.
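For context, the symptom follows from how a typical sampling loop terminates: if the EOS id the loop compares against never matches what the model actually emits, the loop only exits on the max-token cap. A minimal sketch of that mechanism (the mock `next_token` stands in for real inference and is not part of llm's API):

```rust
// Mock sampler: emits a few ordinary tokens, then the model's true EOS id (2).
fn next_token(step: usize) -> u32 {
    if step >= 3 { 2 } else { 100 + step as u32 }
}

/// Generate until `eos_id` is seen or `max_tokens` is hit; returns the
/// produced tokens and whether the stop was a clean EOS stop.
fn generate(eos_id: u32, max_tokens: usize) -> (Vec<u32>, bool) {
    let mut out = Vec::new();
    for step in 0..max_tokens {
        let tok = next_token(step);
        if tok == eos_id {
            return (out, true); // clean stop on EOS
        }
        out.push(tok);
    }
    (out, false) // hit the cap, like the reported llm behavior
}

fn main() {
    // Correct EOS id: generation stops after three tokens.
    assert_eq!(generate(2, 16), (vec![100, 101, 102], true));
    // Wrong EOS id (e.g. a stale hardcoded token): runs to max_tokens.
    let (toks, stopped_on_eos) = generate(999, 16);
    assert_eq!(toks.len(), 16);
    assert!(!stopped_on_eos);
}
```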

philpax commented 6 months ago

Thanks for reporting this! For my own reference, the issue is that this doesn't get the EOT from the tokenizer - instead, it assumes that it's the hardcoded token </s>. This made sense in the early days of LLaMA, but is no longer true: https://github.com/rustformers/llm/blob/e61e5f9461d6c7a14455846bdeba13479e16f396/crates/models/llama/src/lib.rs#L373
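A sketch of the direction a fix could take: prefer the EOS token id stored in the GGUF metadata (the GGUF spec defines the key `tokenizer.ggml.eos_token_id`) and fall back to the hardcoded token's id only when the key is absent. The `Metadata` type here is a hypothetical stand-in; llm's real metadata API differs.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a parsed GGUF metadata table.
type Metadata = HashMap<String, u32>;

/// Resolve the EOS token id: use the value declared in GGUF metadata if
/// present, otherwise fall back to the id of the hardcoded `</s>` token.
fn eos_token_id(metadata: &Metadata, fallback_id: u32) -> u32 {
    metadata
        .get("tokenizer.ggml.eos_token_id")
        .copied()
        .unwrap_or(fallback_id)
}

fn main() {
    // A GGUF file that declares its EOS token id (2, as in LLaMA).
    let mut md = Metadata::new();
    md.insert("tokenizer.ggml.eos_token_id".to_string(), 2);
    assert_eq!(eos_token_id(&md, 999), 2);

    // A file without the key: fall back to the hardcoded token's id.
    assert_eq!(eos_token_id(&Metadata::new(), 999), 999);
}
```

This keeps old `</s>`-style models working while letting newer GGUF files that declare a different EOS token stop correctly.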