Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Bug: llamafiler /tokenize endpoint with add_special does not add special tokens #643

Open k8si opened 4 days ago

k8si commented 4 days ago

Contact Details

ksilverstein@mozilla.com

What happened?

Summary: The llamafiler /tokenize endpoint does not appear to add special tokens when the add_special flag is set to true, at least for llama-3.1-8b-instruct.

Model/system info:

Command used to start llamafiler:

#!/bin/bash

LLAMAFILER="./bin/llamafiler"
GGUF="meta-llama-3.1-8b-instruct.Q5_K_S.gguf"
"${LLAMAFILER}" --model "${GGUF}"

Curl to reproduce issue:

curl \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The quick brown fox jumped over the lazy dog.", "add_special": true, "parse_special": false}' \
  "http://localhost:8080/tokenize"

Output:

{
  "add_special": true,
  "parse_special": false,
  "tokens": [
    "The",
    " quick",
    " brown",
    " fox",
    " jumped",
    " over",
    " the",
    " lazy",
    " dog",
    "."
  ]
}
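For contrast, the expected behavior when add_special is true is that the model's BOS token (<|begin_of_text|> for Llama 3.1) is prepended to the prompt tokens. A toy sketch of that contract (whitespace split stands in for the real BPE tokenizer; only the add_special handling matters here):

```python
def tokenize(prompt, add_special=False, bos_token="<|begin_of_text|>"):
    # Toy whitespace "tokenizer" standing in for the real BPE tokenizer;
    # the point is only what add_special is supposed to do.
    tokens = prompt.split()
    if add_special:
        # add_special=True should prepend the model's BOS token.
        tokens = [bos_token] + tokens
    return tokens

print(tokenize("The quick brown fox", add_special=True))
# ['<|begin_of_text|>', 'The', 'quick', 'brown', 'fox']
```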

For comparison, here is a script that does the same thing in Python using the Hugging Face transformers library directly:

from transformers import AutoTokenizer, PreTrainedTokenizer

def main():
    model_name = "meta-llama/Llama-3.1-8B-Instruct"
    tokenizer: PreTrainedTokenizer = AutoTokenizer.from_pretrained(model_name)

    encoded = tokenizer(
        "The quick brown fox jumped over the lazy dog.",
        add_special_tokens=True
    )
    input_ids = encoded["input_ids"]

    print(tokenizer.convert_ids_to_tokens(input_ids))
    # ['<|begin_of_text|>', 'The', 'Ġquick', 'Ġbrown', 'Ġfox', 'Ġjumped', 'Ġover', 'Ġthe', 'Ġlazy', 'Ġdog', '.']

if __name__ == '__main__':
    main()

Output:

['<|begin_of_text|>', 'The', 'Ġquick', 'Ġbrown', 'Ġfox', 'Ġjumped', 'Ġover', 'Ġthe', 'Ġlazy', 'Ġdog', '.']

Version

llamafile v0.8.16, llamafiler v0.8.16 (but actually I built from source at commit e5c0921)

What operating system are you seeing the problem on?

Mac

Relevant log output

No response

jart commented 4 days ago

I can't reproduce this. Could you try passing https://huggingface.co/Mozilla/Meta-Llama-3.1-8B-Instruct-llamafile/resolve/main/Meta-Llama-3.1-8B-Instruct.Q5_K_M.llamafile as the --model flag? It may be an issue with your GGUF file metadata.
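If the GGUF metadata is the suspect, the key that llama.cpp-based runtimes consult for this is tokenizer.ggml.add_bos_token. As an illustration of where that flag lives (a hand-rolled sketch of the GGUF v3 key/value header layout, not llamafile code), this builds a header-only blob containing just that key and reads it back:

```python
import struct

GGUF_TYPE_BOOL = 7  # value-type enum from the GGUF spec

def build_minimal_gguf(add_bos: bool) -> bytes:
    """Build a header-only GGUF v3 blob with one metadata key (no tensors)."""
    key = b"tokenizer.ggml.add_bos_token"
    buf = b"GGUF"                              # magic
    buf += struct.pack("<I", 3)                # version 3
    buf += struct.pack("<Q", 0)                # tensor count
    buf += struct.pack("<Q", 1)                # metadata KV count
    buf += struct.pack("<Q", len(key)) + key   # key: length-prefixed string
    buf += struct.pack("<I", GGUF_TYPE_BOOL)   # value type tag
    buf += struct.pack("<?", add_bos)          # one-byte bool value
    return buf

def read_add_bos(blob: bytes) -> bool:
    """Walk the KV header and pull out the add_bos_token flag."""
    assert blob[:4] == b"GGUF"
    off = 4 + 4 + 8 + 8                        # skip magic/version/counts
    (klen,) = struct.unpack_from("<Q", blob, off); off += 8
    key = blob[off:off + klen]; off += klen
    (vtype,) = struct.unpack_from("<I", blob, off); off += 4
    assert key == b"tokenizer.ggml.add_bos_token" and vtype == GGUF_TYPE_BOOL
    return struct.unpack_from("<?", blob, off)[0]

print(read_add_bos(build_minimal_gguf(True)))
# True
```

On a real model file, the same key can be inspected with a GGUF metadata viewer (e.g. the gguf_dump.py script that ships with llama.cpp's gguf-py package) to check whether it is missing or set to false.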