eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

Llama 3 GGUF Tokenizer #350

Closed sashokbg closed 5 months ago

sashokbg commented 5 months ago

Hello, I want to test the new Llama 3 8B model locally, but I am unable to run it from the playground because I cannot find a suitable tokenizer.

I run my server like this:

lmql serve-model llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf \
  --cuda \
  --port 9999 \
  --n_ctx 4096 \
  --n_gpu_layers 35

and have the following in my playground:

from
    lmql.model("llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
               endpoint="localhost:9999")

But I get an error message saying that no tokenizer is available:

File "/home/alexander/projects/lmql/.venv/lib/python3.11/site-packages/lmql/runtime/tokenizer.py", line 366, in tokenizer_not_found_error
    raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
App finished running with exit code 1

Any tips on which tokenizer should be used?

ChairGraveyard commented 5 months ago

Passing it the Hugging Face ID of the regular (non-GGUF/non-quantized) repo works for the tokenizer. So for Meta-Llama-3-8B-Instruct you'd pass tokenizer="meta-llama/Meta-Llama-3-8B-Instruct" to your LMQL functions:

from
    lmql.model("llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
               tokenizer="meta-llama/Meta-Llama-3-8B-Instruct",
               endpoint="localhost:9999")
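
For anyone using the Python API instead of the playground, here is a minimal sketch of the same fix (the query body and the constraint are illustrative, not from this thread):

import lmql

# Model handle mirroring the snippet above: local GGUF weights served at
# localhost:9999, with the tokenizer fetched from the original HF repo.
m = lmql.model(
    "llama.cpp:/home/alexander/Games2/models/Meta-Llama-3-8B-Instruct.Q5_K_M.gguf",
    tokenizer="meta-llama/Meta-Llama-3-8B-Instruct",
    endpoint="localhost:9999",
)

@lmql.query(model=m)
def greet():
    '''lmql
    "Say hello in one short sentence: [GREETING]" where len(TOKENS(GREETING)) < 32
    '''

print(greet())
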
sashokbg commented 5 months ago

Hello @ChairGraveyard, I set the tokenizer as you proposed. I also had to accept Meta's license and put my Hugging Face token in ~/.cache/huggingface/token.

Thank you for your help!

sashokbg commented 4 months ago

Hello, I am coming back to this issue to add some additional info. You also need to install the "transformers" dependency, which lmql uses to download the tokenizer from Hugging Face:

pip install transformers
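
With transformers installed, a quick sanity check (my suggestion, not an lmql feature) is to load the tokenizer directly before starting the server, to confirm your HF credentials and license acceptance are in order:

# Assumes transformers is installed and your HF token grants access to the gated repo.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print(tok.tokenize("Hello, Llama 3!"))
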
EdwardSJ151 commented 4 months ago

~/.cache/huggingface/token

Hi, I don't have this token folder/file. Is it a .txt file or something similar? How do I add it?

sashokbg commented 4 months ago

Just create it and put your HF token inside:

https://huggingface.co/docs/hub/en/security-tokens
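
Alternatively, you can let huggingface_hub create the file for you instead of writing it by hand (standard Hugging Face workflow; huggingface_hub comes in as a dependency of transformers):

# Prompts for your token and stores it in the HF cache
# (by default ~/.cache/huggingface/token), where transformers will find it.
from huggingface_hub import login

login()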