.Net: Rethinking SKContext and variable flow

stephentoub commented 1 year ago

[!IMPORTANT]
Labeled Urgent because this may require a breaking change. Please lower the priority if this is not the case.

SKContext represents ambient information that’s meant to be available to functions, like a cancellation token, memory, and variables, and it serves as the vehicle by which that information is passed from one function to the next in a plan as well as out to any consumer of a function. However, the shape of SKFunction forces SKContext to be inserted before and after every function invocation, such that functions are limited by what SKContext can represent and store. All inputs and all outputs need to be stored as strings. Further, functions themselves are responsible for interpreting those provided strings, which they may not have the relevant context to do, and are responsible for producing strings that others can understand. Fidelity of type is lost before that loss is actually necessary.

https://github.com/microsoft/semantic-kernel/pull/1195 takes several steps towards improving that, by allowing methods to be defined with arbitrary signatures (e.g. rather than being able to accept at most one parameter of type string, a function can accept any number of arguments of non-string types), pushing any logic for extracting data from the SKContext or putting data back into the SKContext out of the function's implementation. A function author therefore no longer needs to know anything about an SKContext; they write their method signature accepting the arguments they want and returning the data they want, as they would in any .NET application model, and it’s up to the caller to handle that appropriately. In this PR, it’s the SKFunction as the caller that then handles the mapping of inputs from SKContext to function arguments and from function results back to the SKContext, performing string/object conversions via TypeConverter.
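The caller-side mapping described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: `NativeFunctionBinder` and `MathSkill` are hypothetical names, a plain dictionary stands in for SKContext's string variables, and only static methods are handled.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Reflection;

// Hypothetical helper (not an SK type): binds string variables to typed parameters.
public static class NativeFunctionBinder
{
    // Assumes a static method; each parameter is looked up by name in the variable bag.
    public static string? Invoke(MethodInfo method, IDictionary<string, string> variables)
    {
        ParameterInfo[] parameters = method.GetParameters();
        object?[] args = new object?[parameters.Length];

        for (int i = 0; i < parameters.Length; i++)
        {
            ParameterInfo p = parameters[i];
            if (!variables.TryGetValue(p.Name!, out string? text))
            {
                throw new ArgumentException($"Missing variable '{p.Name}'.");
            }

            // string -> parameter type, using the converter registered for that type.
            TypeConverter converter = TypeDescriptor.GetConverter(p.ParameterType);
            args[i] = converter.ConvertFromInvariantString(text);
        }

        object? result = method.Invoke(null, args);
        if (result is null)
        {
            return null;
        }

        // result type -> string, so it could be stored back as a string variable.
        return TypeDescriptor.GetConverter(result.GetType()).ConvertToInvariantString(result);
    }
}

// Hypothetical skill: the author writes an ordinary signature and never sees a context.
public static class MathSkill
{
    public static double Add(int left, double right) => left + right;
}
```

Calling `NativeFunctionBinder.Invoke(typeof(MathSkill).GetMethod("Add")!, variables)` with `left = "2"` and `right = "3.5"` converts both strings, invokes `Add`, and converts the `double` result back to a string. The function author's code contains no conversion logic at all.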

But to support the kinds of things developers want and need to do, that is both needed progress and insufficient. Limiting function arguments and results to only what SKContext’s variables can store as strings (or be simply translated to/from strings that can be stored) still means that certain desirable shapes can’t be expressed, and forcing all inputs and outputs to go through SKContext limits how data can flow as well as how precisely the data can be represented. For example, a function should be definable to stream responses, in which case its return type should really be defined as something like IAsyncEnumerable<string>, with a native function able to use async/await/yield to easily implement streaming, and a semantic function able to implicitly stream the result from the LLM. Such a result should not be forced back into a string to be stored into the SKContext, as doing so obviates any benefits of streaming: the results would all need to be aggregated into a single string before returning the result to the caller. Or, for example, a caller should be able to populate an SKContext with data from the environment that's available in its original object form (e.g. a reference to a UI control), and a native function should be able to access that object.
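As a rough sketch of the streaming point, assuming a hypothetical native function (`StreamingSketch` and its members are illustrative names, and `Task.Yield` stands in for real I/O or an LLM call), compare a streaming result with what forcing it into a string implies:

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamingSketch
{
    // Hypothetical native function: yields each chunk as soon as it is available.
    public static async IAsyncEnumerable<string> StreamWordsAsync(string text)
    {
        foreach (string word in text.Split(' '))
        {
            await Task.Yield(); // placeholder for awaiting real I/O
            yield return word;
        }
    }

    // What a string-only result forces: the caller sees nothing until the stream ends.
    public static async Task<string> AggregateAsync(IAsyncEnumerable<string> chunks)
    {
        var parts = new List<string>();
        await foreach (string chunk in chunks)
        {
            parts.Add(chunk);
        }
        return string.Join(" ", parts);
    }
}
```

A consumer that receives the `IAsyncEnumerable<string>` directly can process each word in an `await foreach` loop as it arrives. A consumer that only ever gets the result of `AggregateAsync` has to wait for the last chunk before seeing the first.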

I suggest an approach something like the following:

Remove all restrictions on what types functions can accept and return. Importing a skill function or defining a semantic function should work regardless of the number or types of inputs and outputs. Code can invoke a function, whether directly or indirectly via a kernel (if we want to wrap additional pre/post processing around its invocation), and the exact .NET objects passed as input are propagated into the function as its arguments, and the exact .NET object passed as output is propagated out to the caller as that same object. No translation is forced when there's no incompatibility.
Change SKContext’s variables to support arbitrary System.Object values instead of only System.String values (TrustAwareString, assuming it remains a required concept, either becomes just another object that can be stored, and all objects that are stored without the wrapper are implicitly untrusted, or TrustAwareString becomes TrustAwareObject so that arbitrary objects can also be marked as trusted, if that’s a desirable concept).
Push all required conversion between strings and objects to the orchestrator that requires such conversions, and perform it only when it is actually needed. For example, if the output of function A is an IAsyncEnumerable<string>, the input of function B is an IAsyncEnumerable<string>, and the plan dictates that function A’s output be piped to function B, the orchestrator can just pass the IAsyncEnumerable<string> directly from one to the other (via SKContext if desirable, but that’s left up to the orchestrator); it needn’t do any translation. Or if the output of function A is an Int32, the input to function B is a string, and the plan dictates that the output of function A be piped to function B, then the orchestrator can do a conversion from Int32 to string, using whatever mechanism is agreed upon for performing these translations (https://github.com/microsoft/semantic-kernel/pull/1195 uses TypeConverter, which is the same as what’s used by ASP.NET, for example, for converting between textual query string / form / cookie values and function inputs / outputs; similarly for conversions performed in EF, MAUI, etc.). The orchestrator itself can be parameterized with these conversions, e.g. a planner could be given a Func<IAsyncEnumerable<string>, Task<string>> that teaches it how to translate an IAsyncEnumerable<string> into a string, at which point it would support flowing the output of something that returned an IAsyncEnumerable<string> to something accepting a string, but without that would still be able to propagate the object when no conversion was required. The notion of what can be connected to what can be fed into the plan, based on all of the types and what’s compatible with what.
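The three points above could look something like the following in an orchestrator. This is only a sketch under stated assumptions: `PipeConverter`, `Register`, `AdaptAsync`, and `Demo` are hypothetical names, not existing SK APIs, and the TypeConverter fallback covers only the object-to-string direction.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Threading.Tasks;

// Hypothetical orchestrator helper: converts a value only when the target type needs it.
public sealed class PipeConverter
{
    private readonly Dictionary<(Type From, Type To), Func<object, Task<object>>> _custom = new();

    // Lets the host teach the orchestrator a conversion it cannot infer on its own.
    public void Register<TFrom, TTo>(Func<TFrom, Task<TTo>> convert)
        where TFrom : notnull
        where TTo : notnull
        => _custom[(typeof(TFrom), typeof(TTo))] = async value => await convert((TFrom)value);

    public async Task<object> AdaptAsync(object value, Type target)
    {
        // 1. Compatible types: hand over the very same object, no translation.
        if (target.IsInstanceOfType(value))
        {
            return value;
        }

        // 2. A conversion supplied by the host for this pair of types.
        foreach (var (key, convert) in _custom)
        {
            if (key.To == target && key.From.IsInstanceOfType(value))
            {
                return await convert(value);
            }
        }

        // 3. Fall back to TypeConverter for object -> string, skipping the base
        //    TypeConverter, which would merely call ToString().
        if (target == typeof(string))
        {
            TypeConverter converter = TypeDescriptor.GetConverter(value.GetType());
            if (converter.GetType() != typeof(TypeConverter))
            {
                return converter.ConvertToInvariantString(value)!;
            }
        }

        throw new InvalidOperationException($"No conversion from {value.GetType()} to {target}.");
    }
}

public static class Demo
{
    // Stand-in for the streaming output of "function A".
    public static async IAsyncEnumerable<string> Chunks()
    {
        await Task.Yield();
        yield return "a";
        yield return "b";
    }

    // Example host-supplied conversion: drain the stream into one string.
    public static async Task<string> Join(IAsyncEnumerable<string> chunks)
    {
        var parts = new List<string>();
        await foreach (string chunk in chunks) parts.Add(chunk);
        return string.Concat(parts);
    }
}
```

With this shape, piping an `IAsyncEnumerable<string>` into a parameter of the same type returns the identical object, an `Int32` flowing into a `string` parameter goes through `Int32Converter`, and a stream flowing into a `string` parameter works only after the host calls `Register<IAsyncEnumerable<string>, string>(Demo.Join)`.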

This removes artificial barriers. Functions are no longer forced to know anything about SKContext, and values maintain full fidelity in both content and type until the consumer requires a translation, at which point it’s up to that consumer to perform the translation (the consumer here being a direct invoker, the orchestrator, etc.).

RogerBarreto commented 1 year ago

Also adding a reference to issue #1298, which the approaches above might resolve.

markwallace-microsoft commented 1 year ago

Removing this from V1 RC1, as it requires prioritization and design.

matthewbolanos commented 1 year ago

@stephentoub, when we meet later today, we should also discuss the comments you made during the last design review along the lines of: do we even need SKContext if the kernel is the "runtime"?

matthewbolanos commented 1 year ago

@stephentoub, we'll likely need your help revisiting this issue to see what else is necessary to complete it now that several things have changed since this issue was created.

SergeyMenshykh commented 11 months ago

This task can be closed because everything here has already been completed within the scope of other tasks/issues.

.Net: Rethinking SKContext and variable flow #1482