eth-sri / lmql

A language for constraint-guided and efficient LLM programming.
https://lmql.ai
Apache License 2.0

Azure OpenAI API instruct model ERROR on query #346

Open Onebu opened 6 months ago

Onebu commented 6 months ago

Hi, I was just testing Azure OpenAI with the "gpt35-instruct" model, a GPT-3.5 Instruct deployment I had just created. After setting up the model and running a simple test query, I get this error:

/home/yb/.local/lib/python3.10/site-packages/lmql/runtime/bopenai/batched_openai.py:691: OpenAIAPIWarning: OpenAI: ("Setting 'echo' and 'logprobs' at the same time is not supported for this model. (after receiving 0 chunks. Current chunk time: 9.894371032714844e-05 Average chunk time: 0.0)", 'Stream duration:', 0.47087764739990234) "<class 'lmql.runtime.bopenai.openai_api.OpenAIStreamError'>"

But when I run the same query against an OpenAI GPT-3.5 Instruct model, it works properly. Could the problem be that the Azure API endpoint differs from the OpenAI one, in that it disallows setting "logprobs" and "echo" at the same time?

To configure the Azure model, I am using the following code:

import os
import lmql

my_azure_model = lmql.model(
    "model_name",
    api_type="azure",
    api_base=os.getenv("OPENAI_API_BASE"),
    api_key=os.getenv("OPENAI_API_KEY"),
    api_version="2023-09-01-preview",
    tokenizer="openai/gpt-3.5-turbo-instruct",
    verbose=True
)
Onebu commented 6 months ago

I just checked the OpenAI endpoint specification, where echo defaults to false. Is it possible that the Azure endpoint implementation defaults it to true, so the query fails? Or am I using the wrong configuration for the Azure model? Thank you in advance!

ddvlamin commented 2 months ago

I have the same issue. Does anyone know the cause or have a solution for this?

The openai_api.py file contains the following piece of code:

# models that do not support 'logprobs' and 'echo' by OpenAI limitations

MODELS_WITHOUT_ECHO_LOGPROBS = [
    "gpt-3.5-turbo-instruct"
]

I am not sure what it means, but switching to gpt-3.5-turbo instead makes the error disappear in my case.
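If that list is what gates the behavior, one untested workaround idea is to make the client treat the Azure deployment name the same way. This is only a sketch of that filtering logic (the list literal comes from the snippet above, but sanitize_completion_params is a hypothetical helper, not LMQL's actual code):

```python
# Models for which 'echo' and 'logprobs' cannot be set together (from openai_api.py),
# extended with the Azure deployment name used in this issue (hypothetical fix).
MODELS_WITHOUT_ECHO_LOGPROBS = [
    "gpt-3.5-turbo-instruct",
    "gpt35-instruct",
]

def sanitize_completion_params(model: str, params: dict) -> dict:
    """Return a copy of params with the unsupported echo+logprobs pair removed."""
    params = dict(params)
    if model in MODELS_WITHOUT_ECHO_LOGPROBS and params.get("echo"):
        # the API rejects requests that set 'echo' and 'logprobs' together
        params.pop("echo", None)
        params.pop("logprobs", None)
    return params
```

With this, a request for "gpt35-instruct" would have the conflicting pair stripped before being sent, while other models pass through unchanged.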