dotnet / extensions

This repository contains a suite of libraries that provide facilities commonly needed when creating production-ready applications.
MIT License
2.68k stars 758 forks source link

Lower STJ dependency for M.E.AI to 8.0 #5538

Open eiriktsarpalis opened 1 month ago

eiriktsarpalis commented 1 month ago

Currently Microsoft.Extensions.AI.Abstraction requires System.Text.Json version 8.0 whereas Microsoft.Extensions.AI is on 9.0, primarily due to the fact that it's hosting JSON schema generation services and AIFunctionFactory. This implies that any leaf client looking to do schema generation of their own must necessarily:

  1. take a dependency on Microsoft.Extensions.AI and all its dependencies and transitively
  2. take a dependency on System.Text.Json v9.0

We should consider lowering STJ to v8.0 for the main M.E.AI project, which would necessitate porting over the schema generation code implemented in https://github.com/eiriktsarpalis/stj-schema-mapper. This provides a compatible implementation targeting STJ v8.0, however it achieves that using private reflection in a number of locations. This makes the STJ 8 compatible implementation problematic to use in the context of Native AOT unless the user has set IlcTrimMetadata to false. Regardless, this is the implementation currently in use by Semantic Kernel, and we should consider bringing the same implementation to M.E.AI.

eiriktsarpalis commented 1 month ago

cc @stephentoub @SteveSandersonMS @SergeyMenshykh

stephentoub commented 1 month ago

This implies that any leaf client looking to do schema generation

This is the crux of the issue, whether that's necessary or not.

Right now, AIFunctionFactory.Create embeds a schema into the AIFunction, so in general the LM client implementations needn't generate a schema. They'd need to if using an AIFunction that didn't carry a schema, or possibly if they wanted to guarantee that the JSO they're using is the same as the one used to create the schema. None of our three current reference implementations do so.

One option is to simply say that a leaf client should never need to because the AIFunction will always carry a schema. We'd make the schema property non-optional, and if it was of a type that LM client couldn't work with, state by convention it's fine for it to throw an exception or fall back to just degraded behaviors. In practice, it shouldn't be a problem for the schema to always be a part of the AIFunction, as the creator should always be able to define one given it knows what the function is.

If we believe it'll be important for LM clients to generate schemas, though, I wouldn't want them to do so by taking a dependency on M.E.AI; that brings with it too many other dependencies I don't want to force the LM client assemblies to need to take, like M.E.Caching.Abstractions, M.E.DependencyInjection.Abstractions, etc. We'd be saying that every consumer of any functionality would effectively need schema generation, at which point I'd want to push it down into M.E.AI.Abstractions. At that point we could likely remove the STJ 9.0 dependency from M.E.AI, anyway, but LM clients would only need to reference M.E.AI.Abstractions, not M.E.AI, which is one of the key reasons they're separated in the first place. And then, yes, we'd need to use the downlevel mapper implementation in M.E.AI.Abstractions for assets older than net9.0.

eiriktsarpalis commented 1 month ago

This implies that any leaf client looking to do schema generation

This is the crux of the issue, whether that's necessary or not.

Effectively, yes. Secondarily it's about supporting application developers who do not wish to consume the 9.0 STJ package or library components that for their own reason need to call into AIFunctionFactory and friends.

Right now, AIFunctionFactory.Create embeds a schema into the AIFunction, so in general the LM client implementations needn't generate a schema.

It doesn't just embed a schema, it actively wraps it into a JSON function. While we're somewhat forgiving in the types of arguments it is accepting, the output is always and invariably JsonElement. Assuming we wanted to pivot AIFunction back to being a multi-format, pure delegate abstraction, affording schema generation to leaf clients becomes essential. Alternatively, we could pivot fully towards it being a "web typed" function with AIFunctionFactory being fully responsible for filling in the gaps which wouldn't us require doing this.

If we believe it'll be important for LM clients to generate schemas, though, I wouldn't want them to do so by taking a dependency on M.E.AI; that brings with it too many other dependencies I don't want to force the LM client assemblies to need to take, like M.E.Caching.Abstractions, M.E.DependencyInjection.Abstractions, etc. We'd be saying that every consumer of any functionality would effectively need schema generation, at which point I'd want to push it down into M.E.AI.Abstractions. At that point we could likely remove the STJ 9.0 dependency from M.E.AI, anyway, but LM clients would only need to reference M.E.AI.Abstractions, not M.E.AI, which is one of the key reasons they're separated in the first place. And then, yes, we'd need to use the downlevel mapper implementation in M.E.AI.Abstractions for assets older than net9.0.

100%. Lowering the schema generator dependency affords us with flexibility w.r.t. its placement and moving it to M.E.AI.Abstractions makes total sense. For now it needs to stay in M.E.AI.

stephentoub commented 1 month ago

(Just for reference, all of the M.E.AI.* assemblies are now on the same plan as the rest of the repo, where only the net9.0 asset depends on other net9.0 assets. The M.E.AI assembly is the sole outlier.)