eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

[Question] How to use Yi models with local llama-cpp-python when there is no standard tokenizer #274

Open · Rybens92 opened 10 months ago

Rybens92 commented 10 months ago

Mistral and Llama 2 models work nicely, but I cannot get Yi models to tokenize correctly. I tried many repos with the original Yi model as well as with finetunes.

The error I get:

raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'ehartford/dolphin-2_2-yi-34b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
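A quick way to narrow this down outside LMQL is to check whether transformers can resolve the tokenizer on its own (a diagnostic sketch; it assumes the transformers package is installed):

# Diagnostic sketch: if this raises (e.g. because the repo ships a custom
# tokenizer class that requires trusting remote code), LMQL's transformers
# backend cannot resolve the tokenizer either.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ehartford/dolphin-2_2-yi-34b")
print(type(tok))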
reuank commented 10 months ago

Hey @Rybens92,

I haven't tested this exact configuration myself yet, but you can try specifying a tokenizer directly via the tokenizer="ehartford/dolphin-2.2-yi-34b" option. Playground example:

argmax 
    "What is the capital of France? [RESPONSE]"
from 
    lmql.model("local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf", tokenizer="ehartford/dolphin-2.2-yi-34b")
where 
    len(TOKENS(RESPONSE)) < 20

Maybe that solves the issue.

Best, Leon

Rybens92 commented 10 months ago

@reuank I did exactly that. My code:

llama_llm = lmql.model(f"local:llama.cpp:models/{llama_model}", tokenizer=tokenizer, n_ctx=4096, n_gpu_layers=40, n_threads=24)

where

tokenizer = "ehartford/dolphin-2_2-yi-34b"

I also tried many other repos with the Yi base model and finetunes, but got the same error.

reuank commented 10 months ago

You need to add the trust_remote_code=True option, as the YiTokenizer is not known to Hugging Face's tokenizer library. This is also documented here: https://huggingface.co/ehartford/dolphin-2_2-yi-34b.

With this, the downloaded tokenizer should appear in your LMQL cache, which is by default located at ~/.cache/lmql.
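You can confirm this outside LMQL as well (a sketch; it assumes the transformers package is installed):

# Sketch: verify that the custom YiTokenizer loads once remote code is trusted.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(
    "ehartford/dolphin-2_2-yi-34b",
    trust_remote_code=True,  # required: the repo ships a custom tokenizer class
)
print(type(tok))  # should print the Yi tokenizer class instead of raising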

Rybens92 commented 10 months ago

Can I do this with the default lmql model loader? I tried:

lmql.model(f"local:llama.cpp:models/{llama_model}", tokenizer=tokenizer, trust_remote_code=True, n_ctx=4096)

But the error still occurs.

I also tried modifying lmql's tokenizer-loading source code to add trust_remote_code=True, and got the same error.

reuank commented 10 months ago

On my machine, the following example runs in the LMQL playground and produces sensible output:

argmax 
    "What is the capital of France? [RESPONSE]"
from 
    lmql.model("local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf", tokenizer="ehartford/dolphin-2_2-yi-34b", trust_remote_code=True)
where 
    len(TOKENS(RESPONSE)) < 20

In Python it would look something like this:

# test.py
import lmql

@lmql.query(
    model=lmql.model(
        "local:llama.cpp:/YOUR_PATH/dolphin-2_2-yi-34b.Q4_0.gguf", 
        tokenizer="ehartford/dolphin-2_2-yi-34b",
        trust_remote_code=True
    )
)
def prompt():
    '''lmql
    argmax
        "What is the capital of France? [RESPONSE]"
    where
        len(TOKENS(RESPONSE)) < 20
    '''

if __name__ == '__main__':
    print(prompt())

Note that the model card on Hugging Face seems to be incorrect: the correct tokenizer string is ehartford/dolphin-2_2-yi-34b, not ehartford/dolphin-2.2-yi-34b as stated there. I have opened a pull request there to fix that.

Best, Leon

Rybens92 commented 10 months ago

Still doesn't work for me. Setting lmql.model(..., trust_remote_code=True) made all tokenizers stop working on my end.

But I managed to find a repo that contains a standard tokenizer for Yi models: bhenrym14/platypus-yi-34b. Use that repo's tokenizer for all Yi models; at least it works for me.
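For reference, wiring that workaround into the earlier setup would look roughly like this (a sketch; the GGUF file name is a placeholder, and the llama.cpp options are the ones used earlier in this thread):

import lmql

# Placeholder: substitute the actual GGUF file in your models/ directory.
llama_model = "dolphin-2_2-yi-34b.Q4_0.gguf"

llama_llm = lmql.model(
    f"local:llama.cpp:models/{llama_model}",
    tokenizer="bhenrym14/platypus-yi-34b",  # repo with a standard Yi tokenizer
    n_ctx=4096,
)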

reuank commented 10 months ago

Okay, I cannot reproduce that, and know too little about the rest of your setup and the other changes you have made. Glad that you found something that works for you.

Within a fresh LMQL install, the query I posted above worked as it should. Maybe this helps other people who try to run that model in LMQL.

Best, Leon