langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.02k stars, 14.64k forks

OpenAI callback is deceiving when used with Azure OpenAI #24324

Open mspronesti opened 1 month ago

mspronesti commented 1 month ago

Description and Example Code

LangChain appears to compute token usage and cost for both OpenAI and Azure OpenAI models using OpenAICallbackHandler. However, this relies on both APIs returning the full, versioned name of the called model, which Azure OpenAI does not do.

In my subscription I have three deployments of gpt-3.5-turbo, corresponding to gpt-35-turbo-0613, gpt-35-turbo-0312, and gpt-35-turbo-1106, and two deployments of gpt-4, corresponding to gpt-4-1106-preview and gpt-4-0613. However, when they are called for inference, the reported model name is gpt-35-turbo or gpt-4, respectively, regardless of the version. LangChain therefore cannot compute the correct cost, yet no warning is raised. This dictionary here also contains entries that can never be used because of the above, e.g. this one.

from langchain_openai import AzureChatOpenAI

# Two deployments of the same base model, pinned to different versions.
llm1 = AzureChatOpenAI(
    api_version="2023-08-01-preview",
    azure_endpoint="https://YOUR_ENDPOINT.openai.azure.com/",
    api_key="YOUR_KEY",
    azure_deployment="gpt-35-turbo-0613",
    temperature=0,
)

llm2 = AzureChatOpenAI(
    api_version="2023-08-01-preview",
    azure_endpoint="https://YOUR_ENDPOINT.openai.azure.com/",
    api_key="YOUR_KEY",
    azure_deployment="gpt-35-turbo-0312",
    temperature=0,
)

messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]

# Both calls report the unversioned model name, so the deployments are indistinguishable.
llm1.invoke(messages).response_metadata["model_name"]  # gpt-35-turbo
llm2.invoke(messages).response_metadata["model_name"]  # gpt-35-turbo
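
To make the consequence concrete, here is a minimal sketch of a cost lookup keyed only on the reported model name. The table, its prices, and the prompt_cost helper are hypothetical and exist purely for illustration; they are not LangChain's actual table or real pricing.

```python
# Hypothetical per-1K-token prompt prices, for illustration only (not real pricing).
HYPOTHETICAL_COST_PER_1K = {
    "gpt-35-turbo": 0.0015,
    "gpt-35-turbo-0312": 0.002,
    "gpt-35-turbo-0613": 0.0015,
    "gpt-35-turbo-1106": 0.001,
}


def prompt_cost(model_name: str, prompt_tokens: int) -> float:
    """Look up the price by exact model name, as a name-keyed table would."""
    return HYPOTHETICAL_COST_PER_1K[model_name] * prompt_tokens / 1000


# Both deployments in the issue reported this same unversioned name.
reported_by_azure = "gpt-35-turbo"

cost_llm1 = prompt_cost(reported_by_azure, 1000)  # deployment gpt-35-turbo-0613
cost_llm2 = prompt_cost(reported_by_azure, 1000)  # deployment gpt-35-turbo-0312

# The versioned entries are never reached, so both deployments are priced identically.
print(cost_llm1 == cost_llm2)
```

Under these assumed prices, the two deployments get the same cost even though their versioned entries differ, and nothing signals that the versioned rows were skipped.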

System Info

Not applicable here.

keenborder786 commented 1 month ago

You need to set the model_version parameter to get the correct pricing information.
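
A rough sketch of what this suggestion amounts to, assuming the configured version is appended to the unversioned name that Azure reports. The effective_model_name helper below is hypothetical and only mimics that assumed behavior; in practice the value would be passed to the constructor, e.g. AzureChatOpenAI(..., model_version="0613"), and the version is presumably not validated against the actual deployment.

```python
def effective_model_name(reported_name: str, model_version: str = "") -> str:
    """Hypothetical helper: append the configured version to the reported name.

    Assumption: the version suffix is joined with a hyphen, so the result
    can match a versioned entry in a name-keyed cost table.
    """
    if model_version:
        return f"{reported_name}-{model_version}"
    return reported_name


# With a version configured, the two deployments become distinguishable.
name_llm1 = effective_model_name("gpt-35-turbo", "0613")
name_llm2 = effective_model_name("gpt-35-turbo", "0312")

# Without it, the name stays unversioned, which is the situation in the issue.
name_default = effective_model_name("gpt-35-turbo")

print(name_llm1, name_llm2, name_default)
```

The version has to be kept in sync with the deployment by hand, so a stale value would silently select the wrong price.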