KSemenenko closed this issue 7 months ago.
Thanks for raising this, @KSemenenko, we're going to discuss this a bit more internally before we decide on the best direction for solving this. We definitely agree this is something we should support.
@KSemenenko we are doing some work to allow the AI service and the associated model request settings to be configured dynamically when a semantic function (aka LLM prompt) is executed. We will allow multiple different model request settings to be configured for a single prompt, e.g., for a service identified by an id you can set different request settings (max tokens, frequency penalty, ...). The model request settings can be for an OpenAI model or any arbitrary LLM.
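For illustration, here is a minimal sketch of how this could look, assuming a service-id mechanism along the lines of what later shipped in the .NET 1.x packages (a `serviceId` parameter on connector registration, and per-prompt execution settings keyed by service id); the ids and settings below are placeholders:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;

var builder = Kernel.CreateBuilder();

// Two chat services registered under distinct service ids (the ids are arbitrary keys).
builder.AddOpenAIChatCompletion(modelId: "gpt-3.5-turbo", apiKey: apiKey, serviceId: "gpt-3.5");
builder.AddOpenAIChatCompletion(modelId: "gpt-3.5-turbo-16k", apiKey: apiKey, serviceId: "gpt-3.5-16k");
Kernel kernel = builder.Build();

// One prompt, with different request settings per service id.
var summarize = kernel.CreateFunctionFromPrompt(new PromptTemplateConfig("Summarize: {{$input}}")
{
    ExecutionSettings =
    {
        ["gpt-3.5"]     = new OpenAIPromptExecutionSettings { MaxTokens = 1024 },
        ["gpt-3.5-16k"] = new OpenAIPromptExecutionSettings { MaxTokens = 8192 },
    },
});
```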
I'd like to get more information on your use case. Consider the following:
Do you want to be able to say: if the prompt token count is less than, say, 1,000 tokens, use gpt-3.5-turbo, and otherwise use gpt-3.5-turbo-16k?
@markwallace-microsoft yes, you are absolutely right: while the chat is small there is no point in switching to 16k, especially given what a gpt-4 model costs. It is about optimal use of the budget.
And I have another idea: maybe we could tell the planner which models it can use for specific functions/skills/plugins.
For example, say you have a model that you have fine-tuned (or even trained) for a specific task; it works perfectly with one particular function but is not as good at anything else.
Or a task such as summarization could always run on gpt-3.5, while general-purpose tasks go to gpt-4 (see the sketch below).
What do you think?
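As a hedged sketch of that idea, reusing the keyed execution settings from the earlier snippet (the fine-tuned model id and the service ids here are made up), each function lists settings only for the service it should run on:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")!;

var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: "gpt-3.5-turbo", apiKey: apiKey, serviceId: "gpt-3.5");
// Hypothetical fine-tuned model, registered under its own service id.
builder.AddOpenAIChatCompletion(modelId: "ft:gpt-3.5-turbo:my-org:entities:abc123", apiKey: apiKey, serviceId: "my-finetune");
Kernel kernel = builder.Build();

// Summarization always runs on gpt-3.5: only that service id has settings here.
var summarize = kernel.CreateFunctionFromPrompt(new PromptTemplateConfig("Summarize:\n{{$input}}")
{
    ExecutionSettings = { ["gpt-3.5"] = new OpenAIPromptExecutionSettings { MaxTokens = 512 } },
});

// Entity extraction is pinned to the hypothetical fine-tuned model.
var extractEntities = kernel.CreateFunctionFromPrompt(new PromptTemplateConfig("Extract the entities from:\n{{$input}}")
{
    ExecutionSettings = { ["my-finetune"] = new OpenAIPromptExecutionSettings { Temperature = 0 } },
});
```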
@KSemenenko thanks for the feedback.
Could you take a look at this PR? https://github.com/microsoft/semantic-kernel/pull/3040
It includes two examples; your feedback would be much appreciated.
I need to look into how to specify the AI service for a plan. I will update here when I have that information.
I like the idea of ServiceId; it also works well with IAIServiceSelector, where you choose a model.
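To make that concrete, here is a rough sketch of a token-count-based selector for the scenario discussed above, written against the `IAIServiceSelector` shape that shipped in the .NET 1.x release (the PR under discussion may differ); the characters-per-token heuristic and the service ids are assumptions:

```csharp
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Services;

public sealed class TokenCountServiceSelector : IAIServiceSelector
{
    public bool TrySelectAIService<T>(
        Kernel kernel,
        KernelFunction function,
        KernelArguments arguments,
        [NotNullWhen(true)] out T? service,
        out PromptExecutionSettings? serviceSettings) where T : class, IAIService
    {
        // Very rough size estimate (~4 characters per token); a real
        // implementation would render the prompt and use a proper tokenizer.
        int estimatedTokens = arguments.TryGetValue("input", out var input)
            ? (input?.ToString()?.Length ?? 0) / 4
            : 0;
        string serviceId = estimatedTokens < 1000 ? "gpt-3.5" : "gpt-3.5-16k";

        // Services registered with a serviceId are resolvable as keyed services.
        service = kernel.Services.GetKeyedService<T>(serviceId);
        serviceSettings = function.ExecutionSettings?.GetValueOrDefault(serviceId);
        return service is not null;
    }
}
```

The selector would then be registered on the kernel builder, e.g. `builder.Services.AddSingleton<IAIServiceSelector>(new TokenCountServiceSelector());`, so the kernel consults it on each function invocation.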
All .NET issues prior to 1-Dec-2023 are being closed. Please re-open if this issue is still relevant to the .NET Semantic Kernel 1.x release. In the future, all issues that are inactive for more than 90 days will be labelled 'stale' and closed 14 days later.
Description: I am proposing a new feature: add a parameter to the service configuration that specifies the model's context size. This enhancement would allow the Planner to make intelligent model selections based on the context-size requirements of a particular task.
Cost Optimization: With the flexibility to choose by context size, the Planner can use smaller, more cost-effective models, such as gpt-3.5-turbo with a 4k context, for shorter tasks, leading to significant cost savings. Conversely, for tasks with a larger context, the Planner can seamlessly opt for a model with a larger window, such as the 16k variant (a sketch of this selection logic follows below).
Also, maybe it's possible to associate certain semantic functions with a specific model?
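Purely as an illustration of the proposal (none of these types exist in Semantic Kernel, and the context sizes and per-token prices are placeholder numbers), the kind of context-size/cost table the Planner could consult might look like this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical metadata: context window and cost per model, keyed by service id.
public sealed record ModelInfo(string ServiceId, string ModelId, int MaxContextTokens, decimal CostPer1KTokens);

public static class ModelPicker
{
    // Pick the cheapest model whose context window fits the task.
    // Throws if no registered model is large enough.
    public static ModelInfo PickCheapestThatFits(IEnumerable<ModelInfo> models, int requiredTokens) =>
        models.Where(m => m.MaxContextTokens >= requiredTokens)
              .OrderBy(m => m.CostPer1KTokens)
              .First();
}

public static class Demo
{
    public static void Main()
    {
        var models = new[]
        {
            new ModelInfo("gpt-3.5",     "gpt-3.5-turbo",     4_096,  0.0015m),
            new ModelInfo("gpt-3.5-16k", "gpt-3.5-turbo-16k", 16_384, 0.0030m),
            new ModelInfo("gpt-4",       "gpt-4",             8_192,  0.0300m),
        };

        // A ~6,000-token task skips the 4k model and lands on the 16k variant.
        Console.WriteLine(ModelPicker.PickCheapestThatFits(models, requiredTokens: 6_000).ModelId);
    }
}
```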