microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps

https://aka.ms/semantic-kernel

MIT License

.Net: Provide usage data for GetStreamingChatMessageContentsAsync when available in underlying SDK #5660

Open SebastianStehle opened 6 months ago

SebastianStehle commented 6 months ago

Hi,

For the non-streaming method GetChatMessageContentsAsync, I read the usage data like this:

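// Sums the request cost from the token usage attached to the first result.
// CompletionsUsage comes from Azure.AI.OpenAI; options holds the per-token prices.
private decimal CalculateCosts(IReadOnlyList<ChatMessageContent> result)
{
    var costs = 0m;

    // Usage is only present when the connector attaches it to the metadata.
    if (result.Count > 0 &&
        result[0].Metadata?.TryGetValue("Usage", out var m) == true && m is CompletionsUsage usage)
    {
        costs += usage.PromptTokens * options.PricePerInputTokenInEUR;
        costs += usage.CompletionTokens * options.PricePerOutputTokenInEUR;
    }

    return costs;
}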

This works great. For the streaming method, however, none of the streamed items carry usage metadata right now. I am using OpenAI.

Originally posted by @SebastianStehle in https://github.com/microsoft/semantic-kernel/discussions/5624

matthewbolanos commented 6 months ago

Unfortunately, the underlying OpenAI APIs do not provide any usage data while streaming, so there is not any usage data for us to provide. In the future (likely post-Build), we'll provide out-of-the-box tokenizers that will automatically generate this missing telemetry.

stephentoub commented 6 months ago

> In the future (likely post-Build), we'll provide out-of-the-box tokenizers that will automatically generate this missing telemetry.

For tiktoken, which is the tokenizer used by OpenAI for gpt-3.5-turbo and gpt-4, the Microsoft.ML.Tokenizers library now includes an implementation we recommend. https://www.nuget.org/packages/Microsoft.ML.Tokenizers/0.22.0-preview.24162.2
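The tokenizer approach described above could be sketched roughly as follows: concatenate the streamed chunks, count tokens for the prompt and the completion, and apply the per-token prices. This is only a minimal sketch under stated assumptions. `EstimateStreamingCosts`, `CountWords`, and the price constants are hypothetical names and values, and the token counter is passed in as a delegate so that a real tokenizer (for example one from Microsoft.ML.Tokenizers) can be plugged in; the whitespace counter below is a placeholder and not tiktoken. The result is an estimate, since the service may count additional tokens for message formatting.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical prices; the original post reads them from an options object.
const decimal PricePerInputTokenInEUR = 0.00001m;
const decimal PricePerOutputTokenInEUR = 0.00003m;

// Estimates the cost of a streamed response.
// countTokens is an assumption: it stands in for a real tokenizer call.
decimal EstimateStreamingCosts(
    string prompt,
    IEnumerable<string> streamedChunks,
    Func<string, int> countTokens)
{
    // Rebuild the full completion text from the streamed pieces.
    var completion = new StringBuilder();
    foreach (var chunk in streamedChunks)
    {
        completion.Append(chunk);
    }

    var promptTokens = countTokens(prompt);
    var completionTokens = countTokens(completion.ToString());

    return promptTokens * PricePerInputTokenInEUR
         + completionTokens * PricePerOutputTokenInEUR;
}

// Placeholder counter (whitespace-separated words), NOT a real tokenizer.
int CountWords(string text) =>
    text.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length;

var estimate = EstimateStreamingCosts(
    "Say hello",
    new[] { "Hello", " there", "!" },
    CountWords);

Console.WriteLine(estimate);
```

Passing the counter as a delegate keeps the cost logic independent of whichever tokenizer package and version is in use.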

AdaTheDev commented 4 months ago

@matthewbolanos / @stephentoub OpenAI now makes usage data available for streaming chat (https://platform.openai.com/docs/api-reference/chat/streaming#chat/streaming-usage). Does this change the plan to rely on tokenizers, and could this issue be re-opened?
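For illustration, the streaming usage feature referenced above is opt-in at the REST level: the request sets `stream_options` with `include_usage`, and one extra chunk with an empty `choices` array and a populated `usage` object arrives before the stream ends. The sketch below only builds such a request body and parses a usage chunk with System.Text.Json; it does not call the API or use Semantic Kernel. The model name, the sample chunk, its token numbers, and the helper `TryReadUsage` are made up for the example.

```csharp
using System;
using System.Text.Json;

// Request body that asks for usage data on a streamed chat completion.
// The model name and message are placeholders.
var requestJson = JsonSerializer.Serialize(new
{
    model = "gpt-4",
    stream = true,
    stream_options = new { include_usage = true },
    messages = new[] { new { role = "user", content = "Say hello" } }
});

// Illustrative final chunk; the token numbers are invented.
var finalChunk =
    "{\"choices\":[],\"usage\":{\"prompt_tokens\":9,\"completion_tokens\":12,\"total_tokens\":21}}";

// Hypothetical helper: reads the token counts from a chunk if it carries usage.
// Regular chunks have "usage": null (or no usage field) and return false.
bool TryReadUsage(string chunkJson, out int promptTokens, out int completionTokens)
{
    promptTokens = 0;
    completionTokens = 0;

    using var doc = JsonDocument.Parse(chunkJson);
    if (doc.RootElement.TryGetProperty("usage", out var usage)
        && usage.ValueKind == JsonValueKind.Object)
    {
        promptTokens = usage.GetProperty("prompt_tokens").GetInt32();
        completionTokens = usage.GetProperty("completion_tokens").GetInt32();
        return true;
    }

    return false;
}

var hasUsage = TryReadUsage(finalChunk, out var prompt, out var completion);
Console.WriteLine(requestJson);
Console.WriteLine($"{hasUsage}: {prompt} prompt tokens, {completion} completion tokens");
```

Whether and how a given SDK or connector exposes this chunk is a separate question, which is what the rest of the thread is about.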

dmytrostruk commented 4 months ago

@AdaTheDev I think we will add usage data for streaming as soon as it is available in the Azure OpenAI .NET SDK. I'm going to re-open this issue.