langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

Stream does not return usage_metadata #26145

Open lhdlhdlhdhh opened 2 months ago

lhdlhdlhdhh commented 2 months ago


Example Code

    config = {"callbacks": [ChainCallbackHandler(trace_id)]}
    invoke_input = await self._compose_input(question, chat_history)
    async for chunk in langchain_chain.astream(invoke_input, config=config):
        print(chunk.usage_metadata)  # expected token usage here, but it is never populated

Error Message and Stack Trace (if applicable)

No response

Description

My pip package is langchain==0.2.16. I have tried both stream and astream; neither returns the usage_metadata field I need. But the OpenAI API docs say usage is returned in the final streamed chunk. Please help. Thanks.

System Info

langchain 0.2.16
langchain-community 0.2.7
langchain-core 0.2.38
langchain-google-vertexai 1.0.6
langchain-openai 0.1.16
langchain-text-splitters 0.2.2
langsmith 0.1.96
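[Editor's note] With streaming chat models, usage metadata (when the provider sends it at all) typically arrives only on the final chunk, not on every chunk. A minimal sketch of the consumer-side pattern, using a hypothetical FakeChunk stand-in rather than LangChain's real AIMessageChunk class:

```python
# Sketch only: FakeChunk is a hypothetical stand-in for a streamed message
# chunk. The point is that usage_metadata is None on content chunks and is
# populated only on the last chunk, so the consumer keeps the last non-None
# value instead of reading it from each chunk.
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeChunk:
    content: str
    usage_metadata: Optional[dict] = None

def collect_stream(chunks):
    """Concatenate content and capture usage_metadata from the final chunk."""
    text, usage = "", None
    for chunk in chunks:
        text += chunk.content
        if chunk.usage_metadata is not None:
            usage = chunk.usage_metadata
    return text, usage

stream = [
    FakeChunk("Hello"),
    FakeChunk(", world"),
    FakeChunk("", usage_metadata={"input_tokens": 5, "output_tokens": 2,
                                  "total_tokens": 7}),
]
text, usage = collect_stream(stream)
# text == "Hello, world"; usage holds the counts from the trailing chunk
```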

keenborder786 commented 2 months ago

Okay, let me look into it and get back to you.

keenborder786 commented 2 months ago
ChatCompletionChunk(id='chatcmpl-A4Z3oYsSGkIC0cw0Pf9AGyQRkWIAJ', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1725651448, model='gpt-3.5-turbo-0125', object='chat.completion.chunk', system_fingerprint=None, usage=None)

Above is the official response from the endpoint https://api.openai.com/v1/chat/completions, and it does NOT include usage, so the usage_metadata returned by LangChain is also empty. You can also go here and see in the response payload that, when streaming is enabled, it does NOT return the usage.

PS: I am assuming you are using ChatOpenAI in your chain

lhdlhdlhdhh commented 2 months ago

Thanks for your answer! I'm using AzureOpenAI. Below is my test code.

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://openai.azure.com",
    api_key="xx",
    api_version="2023-07-01-preview",
)

response = client.chat.completions.create(
    model="gpt4o",  # model = "deployment_name".
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure AI services support this too?"},
    ],
    stream=True,
    stream_options={"include_usage": True},
)

I saw that the parameters support stream_options, but it raises an exception when I set it. The error looks like this: openai.BadRequestError: Error code: 400 - {'error': {'message': "Unknown parameter: 'stream_options'.", 'type': 'invalid_request_error', 'param': 'stream_options', 'code': 'unknown_parameter'}}
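[Editor's note] The 400 here is likely because the configured api_version ("2023-07-01-preview") predates stream_options; newer Azure OpenAI API versions accept it. A defensive sketch that retries without stream_options when the service rejects the parameter, written against a generic create callable so it is not tied to a specific SDK version (BadRequestError below is a stand-in for openai.BadRequestError):

```python
# Sketch of a fallback: if the service rejects stream_options (e.g. an older
# Azure api_version), retry the same request without it.
class BadRequestError(Exception):
    pass

def create_stream(create, **kwargs):
    try:
        return create(stream_options={"include_usage": True}, **kwargs)
    except BadRequestError as exc:
        if "stream_options" not in str(exc):
            raise                    # unrelated 400: re-raise
        return create(**kwargs)      # old API version: stream without usage

# Usage with a fake `create` that mimics an old deployment:
def old_api_create(**kwargs):
    if "stream_options" in kwargs:
        raise BadRequestError("Unknown parameter: 'stream_options'.")
    return iter(["chunk1", "chunk2"])

chunks = list(create_stream(old_api_create, model="gpt4o", stream=True))
```

The trade-off is that on old API versions you silently lose the usage trailer; logging the fallback instead of swallowing it may be preferable in production.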

lhdlhdlhdhh commented 1 month ago

I tried the code below; sometimes it succeeds!

from openai.types.chat import ChatCompletionStreamOptionsParam
so = ChatCompletionStreamOptionsParam(include_usage=True)

response = client.chat.completions.create(
    model="gpt4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
        {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
        {"role": "user", "content": "Do other Azure AI services support this too?"},
    ],
    stream=True,
    stream_options=so,
)

for chunk in response:
    print(chunk, end="", flush=True)

The last two output chunks look like this:

ChatCompletionChunk(id='chatcmpl-', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1725850685, model='gpt-4o-2024-05-13', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp', usage=None)
ChatCompletionChunk(id='chatcmpl-', choices=[], created=1725850685, model='gpt-4o-2024-05-13', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp', usage=CompletionUsage(completion_tokens=62, prompt_tokens=55, total_tokens=117))
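[Editor's note] Note that the usage-bearing trailer chunk above has an empty choices list, so code that unconditionally indexes chunk.choices[0] will raise IndexError on it. A minimal consumer sketch, using a hypothetical Chunk stand-in for ChatCompletionChunk:

```python
# Sketch only: with include_usage, the final chunk carries usage but an
# empty `choices` list, so the consumer must guard the choices access.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Chunk:  # minimal stand-in for ChatCompletionChunk
    choices: list = field(default_factory=list)
    usage: Optional[dict] = None

def consume(chunks):
    parts, usage = [], None
    for c in chunks:
        if c.choices:                # content / finish_reason chunks
            parts.append(c.choices[0])
        if c.usage is not None:      # usage-only trailer chunk
            usage = c.usage
    return parts, usage

parts, usage = consume([
    Chunk(choices=["delta"]),
    Chunk(choices=[], usage={"completion_tokens": 62, "prompt_tokens": 55,
                             "total_tokens": 117}),
])
```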
keenborder786 commented 1 month ago

If the API is returning usage, then LangChain does have code in place so that usage_metadata won't be null. Try running it a couple of times with LangChain, just as you did directly against the API.

lhdlhdlhdhh commented 1 month ago

With the same code, it only succeeds about 50% of the time.