When I use llama.cpp as the LMQL inference backend, I get the following error:
File "/Users/kaimary/micromamba/envs/dagi/lib/python3.10/site-packages/lmql/runtime/tokenizer.py", line 366, in tokenizer_not_found_error
raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
My test code is simple and reproduces the error 100% of the time:
m = lmql.model("local:llama.cpp:.ggml-model-q4_0.gguf")
print(m.generate_sync("Hello", max_tokens=10))
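For what it's worth, my guess at a workaround would be to pin the tokenizer explicitly instead of relying on auto-detection, assuming `lmql.model` accepts a `tokenizer=` argument as the documentation suggests (I have not verified that this resolves the issue):

```python
import lmql

# Assumed workaround (not verified): pass the matching Hugging Face
# tokenizer id explicitly instead of relying on auto-detection.
# This would presumably require `pip install transformers sentencepiece`
# in the same environment.
m = lmql.model(
    "local:llama.cpp:.ggml-model-q4_0.gguf",
    tokenizer="huggyllama/llama-7b",
)
print(m.generate_sync("Hello", max_tokens=10))
```

This sketch depends on the local GGUF model file being present, so I cannot confirm it runs as-is.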
Any comment is appreciated.