garbidge opened 1 year ago
Thank you for your feedback. This has been routed to the support team for assistance.
I have been wondering the same for months. Tracking usage is trivially easy for the non-streaming version but seems impossible for streaming.
Dear all, any chance to have this feature? Thank you in advance
I would need this as well. Is there any workaround to access the token usage? Thanks
For now it seems like the only feasible option is to count the token usage yourself.
In my (limited) experiments, the combination of the following two methods has been 100% in line with the metrics I can see in the Azure portal for gpt-4.
Use this method to calculate the prompt tokens:
/// <summary>
/// Calculate the number of tokens that the messages would consume.
/// Based on: https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb
/// </summary>
/// <param name="messages">Messages to calculate the token count for.</param>
/// <returns>Number of tokens.</returns>
public int GetTokenCount(IEnumerable<Azure.AI.OpenAI.ChatMessage> messages)
{
    // Per-message overhead constants from the OpenAI cookbook (gpt-3.5-turbo / gpt-4 chat format).
    const int TokensPerMessage = 3;
    const int TokensPerRole = 1;
    const int BaseTokens = 3; // every reply is primed with <|start|>assistant<|message|>

    var disallowedSpecial = new HashSet<string>();
    var tokenCount = BaseTokens;
    var encoding = SharpToken.GptEncoding.GetEncoding("cl100k_base");

    foreach (var message in messages)
    {
        tokenCount += TokensPerMessage;
        tokenCount += TokensPerRole;
        tokenCount += encoding.Encode(message.Content, disallowedSpecial).Count;
    }

    return tokenCount;
}
And simply count the number of messages that you receive when consuming the response stream (each streamed message carries roughly one token):
//...
OpenAIClient client = new(new Uri(endpoint), new AzureKeyCredential(key));
StreamingChatCompletions completions = await client.GetChatCompletionsStreamingAsync("gpt-4", input);
StreamingChatChoice choice = await completions.GetChoicesStreaming().FirstAsync();

int responseTokenCount = 0;
await foreach (var message in choice.GetMessageStreaming())
{
    responseTokenCount++;
    yield return message.Content;
}
//...
@felix-lausch This is missing the schema definitions for tools and functions, as well as the tokens for responses that use them.
Use the SharpToken.GptEncoding.CountTokens method instead; it is optimized for this case.
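For example, a minimal sketch assuming the SharpToken NuGet package is referenced (per the suggestion above, CountTokens returns only the count rather than materializing the token list that Encode produces):

```csharp
using SharpToken;

var encoding = GptEncoding.GetEncoding("cl100k_base");

// Counts tokens without allocating the intermediate List<int> of token ids.
int count = encoding.CountTokens("Hello, world!");
```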
It is now possible in the official OpenAI API to request token usage with streaming (https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156). Can we implement this feature here as well?
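For reference, the upstream API shape from the linked announcement: setting stream_options.include_usage in the request makes the server emit one extra chunk before the final "data: [DONE]", with an empty choices array and a populated usage field. A sketch of the request body (token numbers in the response are illustrative):

```json
{
  "model": "gpt-4",
  "messages": [{ "role": "user", "content": "Hello" }],
  "stream": true,
  "stream_options": { "include_usage": true }
}
```

and of the final streamed chunk:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "choices": [],
  "usage": { "prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21 }
}
```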
Library name
Azure.AI.OpenAI
Please describe the feature.
From what I can see, there is no way to get the CompletionsUsage of a request when using StreamingChatCompletions. It has private readonly IList<ChatCompletions> _baseChatCompletions; but I don't see anywhere this is exposed. It would be nice if there was a way to check the token usage after streaming is complete.

(My apologies if I have missed somewhere that you can already do this when using StreamingChatCompletions.)

Ref: https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/openai/Azure.AI.OpenAI/src/Custom/StreamingChatCompletions.cs