Open Rybens92 opened 10 months ago
Hey @Rybens92,
I haven't tested this exact configuration myself yet, but you can try to specify a tokenizer directly by using the tokenizer="ehartford/dolphin-2.2-yi-34b" option. Playground example:
argmax
    "What is the capital of France? [RESPONSE]"
from
    lmql.model("local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf", tokenizer="ehartford/dolphin-2.2-yi-34b")
where
    len(TOKENS(RESPONSE)) < 20
Maybe that solves the issue.
Best Leon
@reuank I did exactly that. My code:
llama_llm = lmql.model(f"local:llama.cpp:models/{llama_model}", tokenizer=tokenizer, n_ctx=4096, n_gpu_layers=40, n_threads=24)
where
tokenizer = "ehartford/dolphin-2_2-yi-34b"
I also tried many other repos with the Yi base model and finetunes, but got the same error.
You need to add the trust_remote_code=True option, as the YiTokenizer is not known to Hugging Face's tokenizer library. This is also documented here: https://huggingface.co/ehartford/dolphin-2_2-yi-34b. With this, the downloaded tokenizer should appear in your LMQL cache, which is by default located at ~/.cache/lmql.
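As a quick sanity check outside of LMQL, the tokenizer can also be loaded directly with the transformers library (a minimal sketch, assuming transformers is installed; it only verifies that the remote tokenizer code downloads and runs):

from transformers import AutoTokenizer

# The repo ships a custom YiTokenizer, so trust_remote_code=True is required.
tok = AutoTokenizer.from_pretrained("ehartford/dolphin-2_2-yi-34b", trust_remote_code=True)
print(tok("What is the capital of France?"))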
Can I do this with the default LMQL model loader? I tried:
lmql.model(f"local:llama.cpp:models/{llama_model}", tokenizer=tokenizer, trust_remote_code=True, n_ctx=4096)
but the error still occurs.
I also tried modifying the tokenizer-loading code in the LMQL source to add trust_remote_code=True, and... the same error again.
On my machine, the following example runs in the LMQL playground and produces sensible output:
argmax
    "What is the capital of France? [RESPONSE]"
from
    lmql.model("local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf", tokenizer="ehartford/dolphin-2_2-yi-34b", trust_remote_code=True)
where
    len(TOKENS(RESPONSE)) < 20
In Python it would look something like this:
# test.py
import lmql

@lmql.query(
    model=lmql.model(
        "local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf",
        tokenizer="ehartford/dolphin-2_2-yi-34b",
        trust_remote_code=True
    )
)
def prompt():
    '''lmql
    argmax
        "What is the capital of France? [RESPONSE]"
    where
        len(TOKENS(RESPONSE)) < 20
    '''

if __name__ == '__main__':
    print(prompt())
Note that the model card on Hugging Face seems to be incorrect. The correct tokenizer string is ehartford/dolphin-2_2-yi-34b, not ehartford/dolphin-2.2-yi-34b as stated there. I opened a pull request there to fix that.
Best Leon
Still doesn't work for me.
Setting lmql.model(..., trust_remote_code=True) made all tokenizers stop working on my end.
But I managed to find a repo that contains a standard tokenizer for the Yi models: bhenrym14/platypus-yi-34b.
This repo can be used for all Yi models; at least it works for me.
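A minimal sketch of that workaround (the gguf path and the llama.cpp arguments are placeholders; the tokenizer argument is the relevant part):

import lmql

# Pair a local Yi gguf with the standard Yi tokenizer from bhenrym14/platypus-yi-34b.
# Adjust the model path and llama.cpp settings (n_ctx, n_gpu_layers, ...) to your setup.
llama_llm = lmql.model(
    "local:llama.cpp:models/dolphin-2_2-yi-34b.Q4_0.gguf",
    tokenizer="bhenrym14/platypus-yi-34b",
    n_ctx=4096,
)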
Okay, I cannot reproduce that, and know too little about the rest of your setup and the other changes you have made. Glad that you found something that works for you.
Within a fresh LMQL install, the query I posted above worked as it should. Maybe this helps other people who try to run that model in LMQL.
Best Leon
Mistral and Llama2 models are working nicely, but I cannot get the tokenization for Yi models to work. I tried many repos with the original Yi model and also with finetunes.
The error I get: