microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: Extend the ModelResult interface to track when max tokens is exceeded #3119

Closed: joowon-dm-snu closed 7 months ago

joowon-dm-snu commented 1 year ago

Currently, it's cumbersome to detect when generation is cut off because the max token limit was reached. In C#, it would be helpful to include a finish reason in ChatModelResult so that developers can handle the case where max_tokens is exceeded.

// semantic-kernel/dotnet/src/Connectors/Connectors.AI.OpenAI/AzureSdk/ChatModelResult.cs
    internal ChatModelResult(ChatCompletions completionsData, ChatChoice choiceData)
    {
        this.Id = completionsData.Id;
        this.Created = completionsData.Created;
        this.PromptFilterResults = completionsData.PromptFilterResults;
        this.Choice = choiceData;
        this.Usage = completionsData.Usage;
        // Proposed addition (the Azure SDK exposes the finish reason per choice,
        // not on ChatCompletions): this.FinishReason = choiceData.FinishReason;
    }

While Python currently only returns a simple string, I hope it can be changed to return a ChatModelResult object, similar to C#. (If these need to be opened as separate issues, I will split them accordingly.)
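
For reference, here is a minimal sketch of the signal I'd like surfaced, assuming the pre-1.0 openai package (which this version of the kernel uses under the hood); the model and prompt are just illustrative:

import openai

# Raw chat completion call; the response carries a per-choice finish_reason.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a long story."}],
    max_tokens=16,
)
if response["choices"][0]["finish_reason"] == "length":
    # "length" means generation stopped because max_tokens was reached
    print("output was truncated at max_tokens")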

I am using the Python version, and I am indirectly handling this with the following method:

import tiktoken
from semantic_kernel.orchestration.sk_function_base import SKFunctionBase


def is_output_length_exceeding_max_tokens(plugin: SKFunctionBase, output_str: str) -> bool:
    """Return True if the output has likely been truncated at max_tokens."""
    try:
        # Relies on the function's private _ai_service attribute to get the model id.
        encoder = tiktoken.encoding_for_model(plugin._ai_service._model_id)
    except KeyError:
        # Unknown model id: fall back to a common encoding.
        encoder = tiktoken.encoding_for_model("gpt-3.5-turbo")

    num_tokens = len(encoder.encode(output_str))
    return num_tokens >= plugin.request_settings.max_tokens
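
For illustration, a hypothetical call site looks like this (the invocation and names are a sketch, not from my actual code):

# Hypothetical usage: invoke the semantic function, then check its output.
output = str(await plugin.invoke_async("Summarize this very long document..."))
if is_output_length_exceeding_max_tokens(plugin, output):
    # The completion likely hit max_tokens; retry with a higher limit,
    # chunk the input, or surface a warning to the caller.
    ...
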
markwallace-microsoft commented 7 months ago

All .Net issues prior to 1-Dec-2023 are being closed. Please re-open if this issue is still relevant to the .Net Semantic Kernel 1.x release. In the future, all issues that are inactive for more than 90 days will be labelled 'stale' and closed 14 days later.