microsoft / kernel-memory

RAG architecture: index and query any data using LLM and natural language, track sources, show citations, asynchronous memory patterns.
https://microsoft.github.io/kernel-memory
MIT License
1.48k stars 282 forks source link

[Feature Request] Make WithCustomEmbeddingGenerator accepts type instead of concrete class #334

Closed alkampfergit closed 3 months ago

alkampfergit commented 6 months ago

Context / Scenario

I want to use a custom EmbeddingGenerator that depends on some services that are registered in Dependency INjection.

The problem

When I use WithCustomEmbeddingGenerator to use a custom embedding generator I can use a class that depends on some other services, so I'd like to register like

.WithCustomEmbeddingGenerator

Actually I need to create a concrete class so I need to manually resolve all services.

Proposed solution

using .WithCustomEmbeddingGenerator allows me to register the class so when the KernelMemory needs to use my custom embedding generator it will resolve with DependencyInjection mechanism.

Importance

nice to have

marcominerva commented 4 months ago

Analyzing the current implementation based on concrete types:

https://github.com/microsoft/kernel-memory/blob/775301a1cdd84ab7a54458d3fa453a8763c0744d/service/Abstractions/KernelMemoryBuilderExtensions.cs#L122-L140

https://github.com/microsoft/kernel-memory/blob/775301a1cdd84ab7a54458d3fa453a8763c0744d/service/Core/KernelMemoryBuilder.cs#L183-L187

I see that the same instance is added as Singleton for the retrieval (line 132) and to a list of embedding generators for the ingestion phase (line 137). Then, in the library, classes receive in constructors the ITextEmbeddingGenerator object for retrieval rather than the list of generators for the ingestion, based on the context.

So, handling both scenarios with the same interface in Dependency Injection require a little refactoring, because we need a single ITextEmbeddingGenerator for retrieval, but we can have multiple ITextEmbeddingGenerator (that could be different from the one used for retrieval) for ingestion.

I have made some investigations and I think we can handle this requirement using a Keyed Service for retrieval, while keeping using the classic registration for ingestion. In this way, we can get the retrieval ITextEmbeddingGenerator as keyed and keep getting the list of ingestion generators when needed.

I have created a little POC to showcase this scenario. @dluc, if you think this is a valuable feature, I can share the POC with you, before implementing the solution in Kernel Memory, so you can take a look at how I'm thinking to use keyed services.

NOTE: I have talked about ITextEmbeddingGenerator, but the same considerations apply also to IMemoryDb.

dluc commented 4 months ago

yes we need keyed services for LLMs for a bunch of scenarios, for embedding, text generation, chunking etc. Open to look at a PoC but it will take some time because I'd like the system also to support falling back to a default, and being easy to reuse across the project.

marcominerva commented 4 months ago

@dluc, I have made a draft PR to showcase my idea: https://github.com/microsoft/kernel-memory/pull/444. I know that is a structural change and requires a lot of effort, but perhaps it could be a starting point.

alkampfergit commented 4 months ago

In my extensions package I've used a similar structure of IHttpFactory, I'm registering all the needed interface as singleton, I can also used keyed registration.

Then I register one or more UserQuestionPipeline factory, and I simply add handler as standard interfaces, then I can resolve the factory and let it creates the concrete implementation.

It still needs some work but actually it works without too much problem

image

boreys commented 4 months ago

this is just what I need to solve current issue I am having, I use LLAMa for chat completion but still want to use OpenAI for embedding; the problem is when I change the url to Llama server, it assume there is an embedding endpoint to use just like OpenAI. can someone approve the PR quick? also, the example of how to use this new feature would be helpful too

dluc commented 4 months ago

this is just what I need to solve current issue I am having, I use LLAMa for chat completion but still want to use OpenAI for embedding; the problem is when I change the url to Llama server, it assume there is an embedding endpoint to use just like OpenAI. can someone approve the PR quick? also, the example of how to use this new feature would be helpful too

@boreys you should already be able to do that. There are examples in the repo mixing Llama and OpenAI, also OpenAI and LM Studio, e.g.

dluc commented 4 months ago

@dluc, I have made a draft PR to showcase my idea: dotnet/aspnetcore#444. I know that is a structural change and requires a lot of effort, but perhaps it could be a starting point.

IHttpClientFactory allows to link custom HttpClient instances "by key" and "by type" ("Typed Clients"). The typed client approach seems much better, e.g. not requiring global constants and not requiring to touch constructors, e.g. no need to reference IHttpClientFactory explicitly and no magic attributes. By contrast, the use of [FromKeyedServices(...)] makes the code quite ugly, it's very intrusive.

Hoping that will be addressed in future... if keyed services works only with string keys, I think we should at least avoid global constants, for instance using the class full name as a key.

Question: how are services resolved if a keyed dependency is missing? The framework should fall back to a default implementation, does it?

marcominerva commented 4 months ago

Question: how are services resolved if a keyed dependency is missing? The framework should fall back to a default implementation, does it?

Actually, I have noticed that, if I use the FromKeyedServices attribute in a class library and the keyed registration is missing, it falls back to the default dependency injection behavior and gets the last registered implementation. However, If I use FromKeyedServices directly in a Minimal API endpoint handler and the key does not exist, it throws an exception. So, it seems there are different behaviors based on the context. I have filled an issue of the ASP.NET Core repo to understand if it is a bug: https://github.com/dotnet/runtime/issues/102204.

In any case, once understood what it the correct behavior, if necessary it will be easy to provide a fall back to a default implementation.

dluc commented 4 months ago

Context / Scenario

I want to use a custom EmbeddingGenerator that depends on some services that are registered in Dependency INjection.

The problem

Actually I need to create a concrete class so I need to manually resolve all services.

have you tried using ActivatorUtilities ? example:

var kmBuilder = new KernelMemoryBuilder();

// my service registrations
kmBuilder.Services.Add...
kmBuilder.Services.Add...
kmBuilder.Services.Add...

// use my service registrations to instantiate MyEmbeddingGeneratorClass
kmBuilder.Services.AddSingleton<ITextEmbeddingGenerator>(serviceProvider =>
{
    return ActivatorUtilities.CreateInstance<MyEmbeddingGeneratorClass>(serviceProvider);
});