microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
MIT License

.Net OpenAI - Add Usage information for Streaming #6826

Open RogerBarreto opened 1 month ago

RogerBarreto commented 1 month ago

OpenAI recently added a stream_options.include_usage = true parameter that, when set, makes the API send one final chunk containing the usage information. This could be turned on by default in our connector for the text streaming APIs.
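
To make the parameter concrete, here is a minimal sketch that builds a raw request body with the option set and reads the usage out of a raw chunk, using only System.Text.Json. It is an illustration under assumptions, not how the connector does (or will do) it: the model name, the message, the sample chunk strings, and the ReadUsage helper are all hypothetical.

```csharp
using System;
using System.Text.Json;

// Assumption: a hand-built Chat Completions request body.
// The model name and message are placeholders.
var request = new
{
    model = "gpt-4o",
    stream = true,
    stream_options = new { include_usage = true },
    messages = new[] { new { role = "user", content = "Hello" } }
};
string body = JsonSerializer.Serialize(request, new JsonSerializerOptions { WriteIndented = true });

// Hypothetical helper (not part of Semantic Kernel): reads the "usage" object
// from one raw chunk. Returns null when the field is missing or JSON null,
// which is the case for every chunk except the final one.
static (int Prompt, int Completion, int Total)? ReadUsage(string chunkJson)
{
    using JsonDocument doc = JsonDocument.Parse(chunkJson);
    if (!doc.RootElement.TryGetProperty("usage", out JsonElement usage)
        || usage.ValueKind != JsonValueKind.Object)
    {
        return null;
    }
    return (
        usage.GetProperty("prompt_tokens").GetInt32(),
        usage.GetProperty("completion_tokens").GetInt32(),
        usage.GetProperty("total_tokens").GetInt32());
}

// Sample chunks written by hand for this sketch.
string contentChunk =
    "{\"object\":\"chat.completion.chunk\"," +
    "\"choices\":[{\"index\":0,\"delta\":{\"content\":\"Hi\"}}],\"usage\":null}";
string lastChunk =
    "{\"object\":\"chat.completion.chunk\",\"choices\":[]," +
    "\"usage\":{\"prompt_tokens\":13,\"completion_tokens\":7,\"total_tokens\":20}}";

Console.WriteLine(body);
Console.WriteLine(ReadUsage(contentChunk) is null);
Console.WriteLine(ReadUsage(lastChunk)?.Total);
```

Returning null for ordinary chunks lets a caller keep streaming content as usual and record usage only when the final chunk arrives.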

Example of the last chunk:

    {
        "id": "chatcmpl-9bs3D3THDTOtsjYcokah40Ub",
        "object": "chat.completion.chunk",
        "created": 1718812935,
        "model": "gpt-4o-2024-05-13",
        "system_fingerprint": "fp_f4e629d0a5",
        "choices": [],
        "usage": {
            "prompt_tokens": 13,
            "completion_tokens": 7,
            "total_tokens": 20
        }
    }

arynaq commented 1 month ago

Ah, I was going crazy wondering why this was not available.

    var response = chat_completion.GetStreamingChatMessageContentsAsync(history, cancellationToken: cancellationToken);
    await foreach (var chunk in response)
    {
        // Serialize the metadata to inspect it; token usage was expected here.
        if (chunk.Metadata != null)
        {
            var as_json = JsonSerializer.Serialize(chunk.Metadata, new JsonSerializerOptions { WriteIndented = true });
        }
        // Skip chunks that carry no text.
        if (chunk.Content == null)
            continue;
        if (chunk.Content.Length > 0)
            yield return new ServerSideEvent("message", chunk.Content, Guid.NewGuid().ToString(), "1000");
    }

I expected the usage metadata to be there, but this is not implemented yet. In the meantime, is there any way to get it in streaming mode? It is quite important for us to track usage on a per-user basis in our app.

AdaTheDev commented 2 weeks ago

@arynaq I don't believe there is. I ended up creating a temporary wrapper connector around the official OpenAI connector that calculates the token usage manually using Tiktoken.
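
A rough sketch of what such a wrapper could look like, for anyone wanting to try the same workaround. This is not the actual wrapper described above: TrackUsage, FakeStream, and the callback shape are hypothetical names, and the token counter is injected as a delegate so that a Tiktoken-compatible tokenizer can be plugged in. The whitespace counter used here is only a stand-in and does not tokenize the way Tiktoken does.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical wrapper: forwards each streamed text piece unchanged and, once the
// stream ends, reports a completion token count computed by the supplied counter.
static async IAsyncEnumerable<string> TrackUsage(
    IAsyncEnumerable<string> pieces,
    Func<string, int> countTokens,   // assumption: backed by a Tiktoken-compatible tokenizer
    Action<int> onCompleted)
{
    var fullText = new StringBuilder();
    await foreach (string piece in pieces)
    {
        fullText.Append(piece);
        yield return piece;
    }
    onCompleted(countTokens(fullText.ToString()));
}

// Stand-in for the connector's streaming output, for demonstration only.
static async IAsyncEnumerable<string> FakeStream()
{
    foreach (string part in new[] { "Hello", " streaming", " world" })
    {
        await Task.Yield();
        yield return part;
    }
}

int completionTokens = -1;
var received = new List<string>();
await foreach (string chunk in TrackUsage(
    FakeStream(),
    // Placeholder counter: splits on spaces, which is NOT real tokenization.
    text => text.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length,
    count => completionTokens = count))
{
    received.Add(chunk);
}

Console.WriteLine(string.Concat(received));
Console.WriteLine(completionTokens);
```

Prompt tokens would need to be counted separately from the chat history with the same counter, and manual counts may not match the figures OpenAI reports exactly.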