eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0
3.48k stars 191 forks

Does Azure really work? #305

Closed younes-io closed 5 months ago

younes-io commented 5 months ago

I followed the setup described in the docs like this:


import nest_asyncio
nest_asyncio.apply()

from dotenv import dotenv_values, load_dotenv

load_dotenv()

config = dotenv_values(".env")
open_ai_base = config['AZURE_OPENAI_API_BASE']
openai_api_key = config['AZURE_OPENAI_API_KEY']
openai_api_type = config['AZURE_OPENAI_API_TYPE']
openai_api_version = config['AZURE_OPENAI_API_VERSION']
model_name = config['MODEL_NAME']
embedding_model = config["EMBEDDING_MODEL"]

lmql_model = "openai/" + str(model_name)

print("lmql_model ===>", lmql_model)

import lmql

llm: lmql.LLM = lmql.model(
    # the name of your deployed model/engine, e.g. 'my-model'
    lmql_model,
    # "openai/gpt-3.5-turbo",
    # set to 'azure-chat' for chat endpoints and 'azure' for completion endpoints
    api_type=openai_api_type, 
    # your Azure API base URL, e.g. 'https://<YOUR_BASE>.openai.azure.com/'
    api_base=open_ai_base, 
    # your API key, can also be provided via env variable OPENAI_API_KEY 
    # or OPENAI_API_KEY_{<your-deployment-name>.upper()}
    api_key=openai_api_key, 
    # API version, defaults to '2023-05-15'
    api_version=openai_api_version,
    # tokenizer="tiktoken",
    # prints the full endpoint URL to stdout on each query (alternatively OPENAI_VERBOSE=1)
    verbose=False,
)

# simple generation
llm.generate_sync("Hello", max_tokens=10)

I get this error:

TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'MY_DEPLOYMENT' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)

Versions: Python: 3.10.13 (on a Jupyter Notebook on VSCode) LMQL: 0.7.3


What did I do wrong? Has Azure been tested with LMQL? For what it's worth, my deployment works when I call it from Postman, so I don't understand the failure.

younes-io commented 5 months ago

@lbeurerkellner @Viehzeug @charles-dyfis-net: Could you please take a look at this? It's a blocking issue for me. Also, I'd like to know whether I misconfigured something or whether it's really a bug. Thank you

EDIT:

BTW, I have these installed:

(screenshot of installed packages)

younes-io commented 5 months ago

I noticed something:

When I try this:

import lmql

llm: lmql.LLM = lmql.model(
    # the name of your deployed model/engine, e.g. 'my-model'
    "openai/gpt-3.5-turbo",
    api_type='azure-chat', 
    api_base=open_ai_base, 
    api_key=openai_api_key, 
    api_version=openai_api_version,
    verbose=True,
)

I get this in the logs:

Using Azure API endpoint: https://XXXXXXXXXX-XXX-XXX-XXXXXX.openai.azure.com/openai/deployments/gpt-3.5-turbo/chat/completions?api-version=2023-09-01-preview True
openai complete: {'model': 'gpt-3.5-turbo', 'max_tokens': 10, 'temperature': 0, 'user': 'lmql', 'stream': True, 'messages': [{'role': 'user', 'content': 'Hello'}]}
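
The verbose log suggests the deployment segment of the URL is taken directly from the part of the model name after the `openai/` prefix. A minimal sketch of that URL assembly (illustrative only, not LMQL's actual code; the example host is a placeholder):

```python
def azure_endpoint(api_base, model_id, api_version, chat=True):
    """Illustrative: build an Azure OpenAI endpoint URL from its parts."""
    deployment = model_id.split("/", 1)[-1]  # strip the 'openai/' prefix
    suffix = "chat/completions" if chat else "completions"
    return (f"{api_base.rstrip('/')}/openai/deployments/"
            f"{deployment}/{suffix}?api-version={api_version}")

print(azure_endpoint("https://example.openai.azure.com/",
                     "openai/gpt-3.5-turbo", "2023-09-01-preview"))
# https://example.openai.azure.com/openai/deployments/gpt-3.5-turbo/chat/completions?api-version=2023-09-01-preview
```

This is why the first argument to `lmql.model` has to carry the Azure deployment name rather than the underlying model name.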

However, when I try with `"openai/my-deployment-gpt-35"`, I get this error:

OSError: cma-cgm-gpt-35-turbo-sbx-ibm is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo either by logging in with `huggingface-cli login` or by passing `token=<your_token>`

TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'my-deployment-gpt-35' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)

It seems that LMQL considers the deployment and the model to be the same thing, which is not the case in Azure.
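
That matches the error chain above: the name after `openai/` is first tried as a Hugging Face model id, then handed to the tokenizer backends, and a custom deployment name matches neither. A rough sketch of such a lookup (hypothetical, not LMQL's actual implementation; the model set is an illustrative subset):

```python
KNOWN_OPENAI_MODELS = {"gpt-3.5-turbo", "gpt-4", "text-davinci-003"}

def resolve_tokenizer(model_id: str) -> str:
    """Illustrative: derive a tokenizer name from an 'openai/<name>' model id."""
    name = model_id.split("/", 1)[-1]
    if name in KNOWN_OPENAI_MODELS:
        return name  # a tokenizer backend knows this model's encoding
    raise LookupError(f"no suitable tokenizer for '{name}'")

resolve_tokenizer("openai/gpt-3.5-turbo")           # resolves fine
# resolve_tokenizer("openai/my-deployment-gpt-35")  # would raise LookupError
```

Under this reading, an arbitrary deployment name can never resolve on its own, so the tokenizer has to be named explicitly.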

nx-chen02 commented 5 months ago

I've encountered the same issue, hoping someone could provide a resolution.

lbeurerkellner commented 5 months ago

Hi there. Can you try passing a tokenizer via `tokenizer="openai/gpt-3.5-turbo"` to the `lmql.model` constructor? From the error, the problem appears to be that no compatible tokenizer can be derived from the provided deployment name.
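
Concretely, the earlier call with the tokenizer hint added (all values are placeholders; the Azure deployment name stays in the model id, while `tokenizer` names a model whose tokenizer is actually available):

```python
import lmql

llm = lmql.model(
    # your Azure deployment name, prefixed with 'openai/'
    "openai/my-deployment-gpt-35",
    api_type="azure-chat",
    api_base="https://<YOUR_BASE>.openai.azure.com/",
    api_key="<YOUR_AZURE_API_KEY>",
    api_version="2023-05-15",
    # explicit hint: use the tokenizer of the underlying base model
    tokenizer="openai/gpt-3.5-turbo",
)
```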

younes-io commented 5 months ago

:/ It works. Can you please update the docs? I spent two days on this because I couldn't infer that I should also pass `tokenizer="openai/gpt-3.5-turbo"`.

Or let me know if there's something I can do about it :)

Thank you!

lbeurerkellner commented 5 months ago

Great. I added a note to the docs here: https://lmql.ai/docs/latest/models/azure.html

younes-io commented 5 months ago

Awesome! Thank you @lbeurerkellner !

a-coles commented 5 months ago

Hi @lbeurerkellner, I'm hitting a similar issue. Given an `lmql.model()` setup of this form, trying to use Azure:

lmql.model(
    'openai/MY_MODEL',
    api_type='azure',
    api_base=MY_API_BASE,
    api_key=MY_API_KEY,
    api_version=MY_API_VERSION,
    verbose=True,
    tokenizer='openai/MY_MODEL',  # same as first argument
)

When I run this, I still get the error about failing to locate a suitable tokenizer:

lmql.runtime.tokenizer.TokenizerNotAvailableError: Failed to locate a suitable tokenizer implementation for 'MY_MODEL' (Make sure your current environment provides a tokenizer backend like 'transformers', 'tiktoken' or 'llama.cpp' for this model)

Could you advise on what is supposed to go in `tokenizer` in the `lmql.model()` call? Is it supposed to match the model name (as above), or should it be hard-coded to `openai/gpt-3.5-turbo`? When I do the latter, I get an error about a missing `api.env`, possibly because the hard-coded name suggests I am trying to access OpenAI directly.

Thanks!