quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0
120 stars 65 forks source link

Support CDI MemoryIdProvider #523

Closed sberyozkin closed 1 month ago

sberyozkin commented 2 months ago

Right now, the memory id can be supported with @MemoryId or it is set to the connection id with the WebSockets Next integration.

I'd like to be able to use the current SecurityIdentity instance. Or the current JWT token's unique subject claim value.

I'm not 100% sure what is the best way to do it. My proposal is to update Quarkus LangChain4j code which checks what memory id is if no @MemoryId is set, but before falling back to the WS connection id. Here it will check with Arc if the correct scope MemoryIdProvider is registered, if yes, get it and retrieve an instance of Quarkus LangChain4j MemoryIdentifier interface from it and use as a chat memory id, or to keep it simple, just return String.

For example, one demo can have a request scoped MemoryIdProvider registered which will have JsonWebToken injected and return its token subject claim value, ensuring a verified user identity subject value is used as a memory id.

How does it sound ? I'm happy to start looking at this enhancement request once it is agreed upon

geoand commented 2 months ago

I think this is a very interesting idea.

I do have a question however: do you expect this to behave any differently than how it does for Websockets or REST?

sberyozkin commented 2 months ago

As far as I understand, with REST, if @MemoryId is used, it means that somehow the client using Quarkus LangChain4j has to manage that memory id and pass it to Quarkus, and with WebSockets Next, this id is implicitly set up from the WS connection id.

If this provider is introduced then if the client is say HTML based client which does not use Web Sockets, and authenticates to Quarkus with Google, then every time the client accesses Quarkus, the memory id will be calculated from SecurityIdentity. It will also work exactly the same for Web Sockets Next once https://github.com/quarkusio/quarkus/issues/40312 is sorted out - if this provider is available, the security identity will get associated with the WS channel and instead of the connection id, it will be a security identity id, or may be a pair of the connection id and security identity id.

Did I understand your question correctly ?

geoand commented 2 months ago

As far as I understand, with REST, if @MemoryId is used, it means that somehow the client using Quarkus LangChain4j has to manage that memory id and pass it to Quarkus, and with WebSockets Next, this id is implicitly set up from the WS connectio

Actually with REST you have the same idea as with Websockets - i.e. we use the active equest itself

geoand commented 2 months ago

if this provider is available, the security identity will get associated with the WS channel and instead of the connection id, it will be a security identity id, or may be a pair of the connection id and security identity id.

Sure, but what is the advantage of doing that?

sberyozkin commented 2 months ago

I see, if it also works the similar way for both REST and WS, then, if the provider is available, then it is checked, if not fallback to the way Quarkus LangChain4j does it out of the box for REST and WebSockets. I can experiment with a chat bot demo which does not use WebSockets, as a POC.

Sure, but what is the advantage of doing that?

AFAIK, with Web Sockets, a connection id does not guarantee that it came from the user who started the WS upgrade, this is why Quarkus users want for example @RolesAllowed("admin") to work in the on-message WS calls, not only in the on-start calls.

But may be it will be much more relevant with just REST, you mentioned the memory id was also calculated similarly for REST, can you clarify how is it done to correlate things between multiple calls ? FYI, with the provider, the way it will work across calls is that for example an OIDC or Form authentication session cookie will help to create the security identity, etc.

I guess this provider, if introduced, may offer options for users to make more sophisticated correlations, even without the security in place

geoand commented 2 months ago

I think there is a misunderstanding here (which is totally reasonable). The fact that a specific ID is used is not strictly tied to how long memory is available. In your case, that means if that even if you did have something like the user id, it does not mean that you automatically have the memory of all the previous interactions. What is means is that we guarantee that the memory is available for the duration of the specific "request" (and remove the memory if the proper scope is used when that "request" terminates).

sberyozkin commented 2 months ago

In your case, that means if that even if you did have something like the user id, it does not mean that you automatically have the memory of all the previous interactions.

I see what I mean. But I think in the secure Quarkus appplication, the lifespan of the "request" (I'm assuming here that it involves everything for example in a given WS session) will match the security identity lifespan.

Let me consider this example. I authenticate to Quarkus via Keycloak and keep accessing Quarkus, for as long as the OIDC session is active, Quarkus will see the same user id and this very same user id will keep the user specific interactions keyed correctly for LangChain4j - this should be the case with or without WS being used. If the OIDC user logs out, the security event is sent and it can help to clean the corresponding memory (though I haven't though yet about the clean up specifics in the logout case).

It would be useful for tracing as well, which user asked what. Right now, even if I'm biased :-), I think it will be beneficial to associate the security related data with the memory id.

sberyozkin commented 2 months ago

Can you please clarify your earlier comment,

Actually with REST you have the same idea as with Websockets - i.e. we use the active equest itself

How does it work across multiple requests ?

sberyozkin commented 2 months ago

By the way, another reason I came up with this proposal, is that, after looking at the langchain4j code, I can see the memory id is passed with Metadata and it can probably impact the way the custom augmentors provide a user specific content. As long as we can assert/confirm with tests the the security identity name remains the same during a given ChatBot session, it should help with the user specific document selection.

geoand commented 2 months ago

Quarkus will see the same user id and this very same user id will keep the user specific interactions keyed correctly for LangChain4j - this should be the case with or without WS being used

This would only be the case if we knew the lifecycle of the user id. My point is that REST requests and WS requests work automatically because the AI service itself also has the proper scope (meaning we can clear out memory when it goes away).

geoand commented 2 months ago

How does it work across multiple requests ?

It does not, that's part of what I am failing to explain 😄. Our automatic memory handling does not ensure that subsequent requests from the same user see the previous memory items. If users want that, they need to handle memory manually (which is not too hard)

sberyozkin commented 2 months ago

Hi @geoand

I think I understand that the way @SessionScope works supports the correct memory management at that langchain4j level. My understanding the CDI session scope is bound to the lifecycle of the HTTP session.

I see how each user's chat bot session avoids clashing with another user's session in this case.

And it is exactly the same case for example, for a typical secure OIDC session, we don't depend there on the CDI session scope, manage it in a different way, but the idea is the same.

So getting back to your question, what is the advantage of trying to use a user name or some other security related identifier, which can work in in the latter case, with the memory id provider. I think it will let introduce the chat bot experience beyond the first entry, where the user has authenticated and authorized, where it is a user-specific, personalized experience. Take a DB SQL query example mentioned at the Insights call yesterday - if a principal name is alice@mydomain.com then a user specific SQL query can be run on the local DB to do the user specific RAG.

With @SessionScope alone, it works technically, but does not give an info about the current user. And I'm not sure how it aligns with request-scoped Quarkus security, what are security characteristics of such a case.

IMHO, it is worth giving it a try. Someone will do it sooner or later in some form, the only question IMHO is where it will be done.

Like I said, if agreed in principle, I can start planning a draft PR to show how it may work, and learn a few things along the way. I'm thinking of the following demo:

Something along these lines. The user logs out, the provider can catch a security event and clear the Chat memory somehow. Or may be CDI session scope can be used along the way since I expect the lifespan of this session scope and the OIDC session be more or less equal. This will have to be figured out.

I think all I'll need to get started is to know where I can put this provider check in a pure REST case, to start with, just before the automatic memory management decides on allocating the memory id itself.

I guess I may be still missing a few details :-), but some kind of association between the the ChatBot session and the current security identity will be useful IMHO.

Please think about it with the rest of the Quarkus langchain4j team :-), I can try to help if this enhancement request can be of interest...

geoand commented 2 months ago

Like I said, if agreed in principle, I can start planning a draft PR to show how it may work, and learn a few things along the way.

Sounds good to me!

sberyozkin commented 2 months ago

Thanks @geoand, I'll give it a try, make take me a bit of time to get something real produced with a few other issues in the mix, but I do commit to it. Cheers

geoand commented 2 months ago

👌

sberyozkin commented 2 months ago

At the moment I'm looking at the fraud detection demo which does not require a custom memory id, but once we resolve the WS-Next securty integration issue, it would work nicely with the secure variation of a demo like csv-chat-bot

sberyozkin commented 1 month ago

Hi @geoand I've found DefaultMemoryIdProvider and I can register a custom implementation in #539, with a low priority, and it picks up the current identity. So may be then this issue is invalid ? My only minor concern is that the security based one must go at the -1 priority to take over not only RequestScopeStateDefaultMemoryIdProvider but also WebSocketConnectionDefaultMemoryIdProvider which is at the default priority. But may be in the future, DefaultMemoryIdProvider can get discovered as a CDI bean as well.

But in any case, I'm not sure now MemoryIdProvider is needed if DefaultMemorIdProvider is already available, what do you think ?

sberyozkin commented 1 month ago

Let me work on updating #539 to show how one might want to use it

geoand commented 1 month ago

But in any case, I'm not sure now MemoryIdProvider is needed if DefaultMemorIdProvider is already available, what do you think ?

I am not sure exactly which classes you mean since we only have DefaultMemoryIdProvider :)

sberyozkin commented 1 month ago

I am not sure exactly which classes you mean since we only have DefaultMemoryIdProvider :)

I meant that I was thinking of adding a new interface, MemoryIdProvider, in this issue, but since DefaultMemoryIdProvider is available, duplicating it with another interface seems unnecessary.

sberyozkin commented 1 month ago

Yeah, I think I should close it for now, sorry I did not do my homework earlier :-)

geoand commented 1 month ago

NP!