eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

Difficulty getting started! #352

Open mcchung52 opened 2 months ago

mcchung52 commented 2 months ago

LMQL looks very promising (having played with Guidance), so I want to make it work, but I'm having issues from the get-go trying to run it locally. I'm really hoping I can get some help.

IMMEDIATE GOAL: what is the simplest way to make this work?

Context: I have several gguf models that I want to run on my MacBook Pro (pre-M, Intel), i.e. on CPU. I've run them that way many times before from Python code, though slowly.

I want to:

1. Run the model directly in Python code.
2a. Run the model by exposing it via an API, e.g. localhost:8081 (see the sketch after this list).
2b. (Can't on my Mac, but can on my PC) run the gguf via LM Studio on the PC, expose ip:port there, and have Python code on the Mac tap into it.
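For 2a, LMQL's built-in inference server may be the simplest route. A minimal sketch, assuming the `lmql serve-model` CLI described in the LMQL docs; the path, host, port, and tokenizer id are placeholders to adapt:

# on the machine hosting the model, start the LMTP server (shell):
#   lmql serve-model llama.cpp:/path/to/codeqwen-1_5-7b-chat-q8_0.gguf --host 0.0.0.0 --port 8081

import lmql

# client side: no "local:" prefix, so LMQL connects to the running server
# instead of loading the weights in-process
m = lmql.model(
    "llama.cpp:/path/to/codeqwen-1_5-7b-chat-q8_0.gguf",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",  # assumption: HF tokenizer matching this gguf
    endpoint="<pc-ip>:8081",               # assumption: host:port where the server runs
)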

Code (option 1, running the model in-process):

import lmql

model_path = "/Users/mchung/Desktop/proj-ai/models/"
# model = "wizardcoder-python-13b-v1.0.Q4_K_S.gguf"
model = "codeqwen-1_5-7b-chat-q8_0.gguf"
# model = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"

m = f"local:llama.cpp:{model_path+model}"
print(m)

@lmql.query(model=lmql.model(m, verbose=True))
def query_function():
    '''lmql
    """A great good dad joke. A indicates the punchline
    Q:[JOKE]
    A:[PUNCHLINE]""" where STOPS_AT(JOKE, "?") and \
                           STOPS_AT(PUNCHLINE, "\n")
    '''
    return "What's the best way to learn Python?"

response = query_function()
print(response)

Thanks in advance.

mcchung52 commented 2 months ago

for now, getting this error:

    raise TokenizerNotAvailableError("Failed to locate a suitable tokenizer implementation for '{}' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)".format(model_identifier))
lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'huggyllama/llama-7b' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)
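Judging from the fallback model name in the message, LMQL could not find a tokenizer: when none is specified for a .gguf file, it defaults to 'huggyllama/llama-7b' and then needs a backend such as transformers to load it. A sketch of pinning the tokenizer explicitly instead (the Hugging Face id is an assumption for this CodeQwen gguf):

import lmql

model_path = "/Users/mchung/Desktop/proj-ai/models/codeqwen-1_5-7b-chat-q8_0.gguf"

# passing tokenizer= avoids the 'huggyllama/llama-7b' fallback entirely
m = lmql.model(
    f"local:llama.cpp:{model_path}",
    tokenizer="Qwen/CodeQwen1.5-7B-Chat",  # assumption: HF tokenizer matching the gguf
    verbose=True,
)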

sashokbg commented 1 month ago

Hello @mcchung52, I had similar issues. Please check this issue: #350

miqaP commented 1 month ago

Hi, I had the same error when trying to run a model with the llama.cpp loader. It's not very clear from the documentation, but to run a model with llama.cpp you have to install the package lmql[hf] (instead of plain lmql) so that a tokenizer backend is available, plus llama-cpp-python to provide the inference backend.
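For reference, a minimal setup matching this advice (the extras syntax follows the LMQL installation docs; the quotes guard against shell globbing):

pip install "lmql[hf]"
pip install llama-cpp-python

followed by a quick smoke test (the gguf path is a placeholder):

import lmql

# with a tokenizer backend installed (transformers, via lmql[hf]) and
# llama-cpp-python available, a tiny query like this should run without
# the TokenizerNotAvailableError
@lmql.query(model=lmql.model("local:llama.cpp:/path/to/model.gguf"))
def smoke_test():
    '''lmql
    "Hello[CONT]" where STOPS_AT(CONT, "\n")
    '''

print(smoke_test())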

Does that help?