quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0
134 stars 80 forks source link

Add secure-fraud-detection demo #539

Closed sberyozkin closed 3 months ago

sberyozkin commented 5 months ago

Draft PR for a secure-fraud-detection demo.

The startup issue described here earlier was resolved as advised by Georgios

geoand commented 5 months ago

Nice!

I am guessing that the error you are seeing is due to maven not being configured with to make javac retain method parameter names.

sberyozkin commented 5 months ago

Thanks @geoand, yes, I was wondering how exactly does the translate demo work which has 2 parameters, and now I've found https://github.com/quarkiverse/quarkus-langchain4j/blob/main/samples/cli-translator/pom.xml#L12 :-)

sberyozkin commented 4 months ago

@geoand @jmartisk I'm getting closer to finalizing the demo flow, a user accesses the login page where a Google authentication option is offered, user logs in and is redirected to the fraud detection entry page where the user can choose to check it by amount or distance. This in itself is useful enough to show how to use the authentication token, but I think the most interesting part is actually how to utilize the fact the OIDC session is available for correlating multiple fraud queries to get more interesting responses each time the same authenticated user tries to check the fraud status, so I'll play around next with the custom memory id idea. Once we are all happy with the demo flow, I'll fix README

sberyozkin commented 4 months ago

@geoand I prototyped a custom DefaultMemoryIdProvider and in this demo I'd like to do something simple, since no WebSockets are involved, like adding a boolean property indicating if the customer has already issued a given query before (not 100% sure how to do it yet :-), right now I'm just hoping the existing chat history can be correctly detected), and then later in the secure variation of the csv chat box demo, use the security identity info in the memory id to impact the SQL query.

But the following error is becoming quite persistent:

Resulted in: java.lang.RuntimeException: dev.ai4j.openai4j.OpenAiHttpException: {
  "error": {
    "message": "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.",
    "type": "invalid_request_error",
    "param": "messages.[1].role",
    "code": null
  }
}
    at dev.langchain4j.internal.RetryUtils$RetryPolicy.withRetry(RetryUtils.java:195)
    at dev.langchain4j.internal.RetryUtils.withRetry(RetryUtils.java:229)
    at dev.langchain4j.model.openai.OpenAiChatModel.generate(OpenAiChatModel.java:153)
    at dev.langchain4j.model.openai.OpenAiChatModel.generate(OpenAiChatModel.java:118)
    at dev.langchain4j.model.chat.ChatLanguageModel_XNMsOaekknG7BdNZ5YSUkjh1SqE_Synthetic_ClientProxy.generate(Unknown Source)
    at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.doImplement(AiServiceMethodImplementationSupport.java:228)
    at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.implement(AiServiceMethodImplementationSupport.java:97)
    at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl.detectAmountFraudForCustomer(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass.detectAmountFraudForCustomer$$superforward(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass$$function$$1.apply(Unknown Source)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
    at io.smallrye.faulttolerance.FaultToleranceInterceptor.lambda$syncFlow$3(FaultToleranceInterceptor.java:253)
    at io.smallrye.faulttolerance.core.InvocationContext.call(InvocationContext.java:20)
    at io.smallrye.faulttolerance.core.Invocation.apply(Invocation.java:29)
    at io.smallrye.faulttolerance.core.timeout.Timeout.doApply(Timeout.java:55)
    at io.smallrye.faulttolerance.core.timeout.Timeout.apply(Timeout.java:30)
    at io.smallrye.faulttolerance.FaultToleranceInterceptor.syncFlow(FaultToleranceInterceptor.java:255)
    at io.smallrye.faulttolerance.FaultToleranceInterceptor.intercept(FaultToleranceInterceptor.java:182)
    at io.smallrye.faulttolerance.FaultToleranceInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass.detectAmountFraudForCustomer(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_ClientProxy.detectAmountFraudForCustomer(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionResource.detectBaseOnAmount(FraudDetectionResource.java:39)
    at io.quarkiverse.langchain4j.sample.FraudDetectionResource_Subclass.detectBaseOnAmount$$superforward(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionResource_Subclass$$function$$2.apply(Unknown Source)

Can you please, when you have a few mins, have a quick look at FraudDetectionAi and both TransactionRepository and CustomerRepository, does something look wrong ? I've seen this error on the web, but I'm not sure yet where to start digging, thanks

sberyozkin commented 4 months ago

It is transient though, after some delay, I've restarted and I'm getting an interesting response, There are no recent transactions for the customer John Doe with the email john.doe@example.com., and then later I got a normal response. Probably related to the OpenAi session left hanging after some restarts, so may be some extra close is needed.

sberyozkin commented 4 months ago

In any case, I'm moving away from using tools in favor of ContentRetriever, yet to be implemented similarly to how it is done in the csv chatbot demo, since it gives a nice option to use Query.getMetadata().getMemoryId, but I'm keeping it as a separate commit in case I get stuck :-)

Update: I've followed with a few minor updates and squashed everything, it will be easy to get back to tools if necessary

jmartisk commented 4 months ago

Gonna look into this one in a bit!

sberyozkin commented 4 months ago

@geoand @jmartisk I think this is ready for review now, have a look please, if possible, test with Google.

Later, I'll follow up with hopefully a more advanced variation involving web sockets, once Quarkus 3.11 or later is used.

sberyozkin commented 4 months ago

The custom ContentRetriever has a request context activated to be able to read from Panache repositories.

I was hoping that may be I can add @Authenticated there as well and get SecurityIdentity injected, instead of using a custom memory id to capture a minimum session state in the form of the authenticated user's name and email, like

@ActivateRequestContext
@Authenticated
List<Content> retrieve(Query query) {}

But it fails with

jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=io.quarkus.vertx.http.runtime.CurrentVertxRequest, id=0_6n6EmChCiiDdd8HelptG_A0AE]
    - you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
    at io.quarkus.arc.impl.ClientProxies.notActive(ClientProxies.java:70)
    at io.quarkus.arc.impl.ClientProxies.getSingleContextDelegate(ClientProxies.java:30)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.arc$delegate(Unknown Source)
    at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.getOtherHttpContextObject(Unknown Source)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusCurrentRequest.get(QuarkusCurrentRequest.java:21)
    at org.jboss.resteasy.reactive.server.core.CurrentRequestManager.get(CurrentRequestManager.java:8)
    at io.quarkus.resteasy.reactive.server.runtime.StandardSecurityCheckInterceptor.intercept(StandardSecurityCheckInterceptor.java:39)
    at io.quarkus.resteasy.reactive.server.runtime.StandardSecurityCheckInterceptor_AuthenticatedInterceptor_Bean.intercept(Unknown Source)
    at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
    at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
    at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
    at io.quarkiverse.langchain4j.sample.FraudDetectionContentRetriever_Subclass.retrieve(Unknown Source)
    at io.quarkiverse.langchain4j.sample.FraudDetectionContentRetriever_ClientProxy.retrieve(Unknown Source)

@michalvavrik, does it ring any bell to you ?

It is strange because @ActivateRequestContext is there, it looks like it is lost during the context propagation...

So I guess for now, using a custom memory id is OK, in fact, it can make sense to combine eventually both @Authenticated and the custom memory id to compare the injected identity with the info from this memory id to assert within the retriever that the AI conversation is essentially bound to the current identity to avoid any chances of the retriever being called outside of the current security context. So I'd say the above exception is not a big problem for now as @Authenticated is already protecting the entry Login and FraudDetectionResource resources.

michalvavrik commented 4 months ago

It is strange because @ActivateRequestContext is there, it looks like it is lost during the context propagation...

StandardSecurityCheckInterceptor has priority PLATFORM_BEFORE while ActivateRequestContextInterceptor has PLATFORM_BEFORE + 100. Quarkus Security extension standard security interceptors have LIBRARY_BEFORE. As long as you fix this bug with anything lesser than LIBRARY_BEFORE, it will be fine. Sorry, I caused this (quite a long time ago now). You want to skip Quarkus Security interceptors so you need to run the interceptors in Quarkus REST before them (users complained security checks are repeated).

michalvavrik commented 4 months ago

So I guess for now, using a custom memory id is OK, in fact, it can make sense to combine eventually both @Authenticated and the custom memory id to compare the injected identity with the info from this memory id to assert within the retriever that the AI conversation is essentially bound to the current identity to avoid any chances of the retriever being called outside of the current security context. So I'd say the above exception is not a big problem for now as @Authenticated is already protecting the entry Login and FraudDetectionResource resources.

I know practically nothing about this project and code, but let me note that you will not be able to use standard security annotations on the FraudDetectionContentRetriever or down the stream because the exception you are getting means that CDI request context has not been activated by Quarkus REST (e.g. at that point, it's already deactivated for some reason and ActivateRequestContext will activate brand new context). Which means no SecurityIdentity unless I am missing something custom you do there.

sberyozkin commented 4 months ago

Thanks @michalvavrik for the feedback, getting to checking comments only now... OK, I'll open a Quarkus issue to consider what can be done with the priorities. I've only tried it to consider optimizing the way the custom identity representation is passed through the various Quarkus LangChain4j beans, for now using a custom memory id only is fine IMHO.

jmartisk commented 4 months ago

Ok I can get it working now. I'm not super happy about users having to manually update the Setup class to add a user that exactly matches their Google account. If one logs in as some other user, I get a NPE in CustomerRepository.getTransactionLimit because the user doesn't exist in the database (this error should definitely be handled better, it should show an error in the UI or something, or at least re-throw the exception with a meaningful message).

I'm thinking how to make it easier.. Could we maybe allow the user to specify an email address (and name) that should be registered during the Setup, as a config property? (If we decide to keep the current approach, I say it should at least be much more obvious from the readme that one has to update the class)

sberyozkin commented 4 months ago

Thanks @jmartisk, makes sense to make it easier for users to register emails, may be with a system property, will have a look next week

sberyozkin commented 3 months ago

Thanks @geoand,

Would it not be far easier to inject the logged in user in FraudDetectionRetrievalAugmentor?

Yeah, good point, I tried it earlier, to inject it directly to FraudDetectionContentRetriever but RequestContext carrying the security identity was not available.

I can try to inject it into a custom retrieval augmentor, but if that works, how would I pass it to the content retriever ?

geoand commented 3 months ago

Yeah, good point, I tried it earlier, to inject it directly to FraudDetectionContentRetriever but RequestContext carrying the security identity was not available.

You are right, I see why this is the case...

Unfortunately I don't have any good solutions for you at the moment, but I do see how we add something to LangChain4j to make this possible.

Essentially what I have in mind is to enhance dev.langchain4j.rag.query.Metadata with a Map<String, Object> where custom data can be added. Then we would also need a new concept like MetadataEnhancer which we could then use in Quarkus to put in whatever we want, in this case the user id or email or whatever.

Then your ContentRetriever could retrieve the information from query.getMetadata();

What I don't like about this is that it would force user code (the ContentRetriever to know specific keys).

An alternative approach would be for us to write our own version of DefaultRetrievalAugmentor which would not suffer from the problem of not passing the context (as it take steps to have that done properly).

WDYT @jmartisk @sberyozkin ?

sberyozkin commented 3 months ago

Essentially what I have in mind is to enhance dev.langchain4j.rag.query.Metadata with a Map<String, Object> where custom data can be added.

Sure, that can definitely do it, some well known key properties like io.quarkus.security.SecurityIdentity can be doc-ed

An alternative approach would be for us to write our own version of DefaultRetrievalAugmentor which would not suffer from the problem of not passing the context (as it take steps to have that done properly).

I'm not sure yet as I don't know much about the mechanics involved, on the basic level having an extra Metadata Map looks like it can work better as the identity may be needed across several pieces where Query is available, not only at the RAG level.

I was actually thinking about some workarounds on the way to the office :-) I was still thinking in terms of a custom memory id since the authenticated/logged in user's session lifetime matches the secure memory id lifetime. So what I had in mind is that the project introduces CompositeMemoryIdBean bean and now, the core code, before converting a provided memory id object to String, checks if this object has a marker like CompositeMemoryBeanExpected (or may be simply if it an instance of String or not - if not - wrapping is expected). And if the composite memory id bean is expected, then CompositeMemoryIdBean will wrap and carry the original custom memory id object around, its toString() method will be done the way it is done now in the core (memory id object string rep plus an AI service class/method name), and then whichever provider needs it, it casts a memory id object to CompositeMemoryIdBean after an instanceof check, and gets an original memoryId object as getMemoryId

May be it is a bit hacky, not sure, though it would probably fit closely enough the expectation that the memory id can be any object. I think I also like the idea of the extra metadata map.

geoand commented 3 months ago

I was actually thinking about some workarounds on the way to the office :-) I was still thinking in terms of a custom memory id since the authenticated/logged in user's session lifetime matches the secure memory id lifetime.

This is the one solution I would like to avoid because MemoryId is really a different concept. It's fine to populate it automatically, but it's not fine to use it's value to do something else.

sberyozkin commented 3 months ago

@geoand I'm happy enough to pursue adding extra Map to Metadata, I can probably suggest a PR to langchain4j, it looks it can be of general interest... How would we set the extra bits into this Map though at the quarkus-langchain4 level ?

geoand commented 3 months ago

it looks it can be of general interest.

Yeah, I think so to, but there also needs to be a way to add metadata, something like the MetadataEnhanced I suggested above.

sberyozkin commented 3 months ago

@geoand OK, then, it will certainly cover the RAG flow. Can you clarify please, when you get a sec, why RequestContext is not available in the custom augmentor or content retriever, like you said, that would definitely be a simplest solution, if it were available. (I'm also thinking about making the identity visible in other places like custom tools and other custom non-RAG related providers). I'm pretty sure your idea of adding an extra map to Metadata can be generally useful though, starting from it would be great, so I'll look into opening a PR in langchain4j

geoand commented 3 months ago

Can you clarify please, when you get a sec, why RequestContext is not available in the custom augmentor or content retrieve

Because the implementation in upstream LangChain4j uses a CompletableFuture and it's own executor in order to do calls in parallel. Now that I think about it, it probably makes sense to see updating DefaultRetrievalAugmentor to use our own Executor which we would feed with the one from context propagation works. Do you want to give that a shot?

geoand commented 3 months ago

Now that I think about it, it probably makes sense to see updating DefaultRetrievalAugmentor to use our own Executor which we would feed with the one from context propagation works. Do you want to give that a shot?

Actually, it should be possible to do this even now, no changes to LangChain4j needed.

sberyozkin commented 3 months ago

Hi @geoand, I was just dealing with formatting as well as a suggestion from @jmartisk to report something to the user when a customer is missing, and now a customer full name and email must be provided at startup as system properties, as opposed to having to modify Setup.java. Hopefully the user exp will be better. Jan, please have a look when you are back.

Super cool that we can try our own executor, I can test, how do I do it ? Is Quarkus executor injectable or which one do you have in mind ?

geoand commented 3 months ago

You can inject org.eclipse.microprofile.context.ManagedExecutor from context propagation and use that

sberyozkin commented 3 months ago

@geoand Now I'm really happy, the custom memory id provider has gone :-), secure authenticated access to the content retriever is done, thanks for the ManagedExecutor idea. I think it can probably make sense to doc it separately, I can work a bit later on a doc dedicated to securing various AI service parts...

I think this PR is now quite ready to go once yourself and @jmartisk happy enough with it in general

geoand commented 3 months ago

Very nice!

+1 on getting this in - but let's wait for @jmartisk

jmartisk commented 3 months ago

Guys I see you were discussing passing some extra metadata around a RAG pipeline, please check https://github.com/langchain4j/langchain4j/issues/1122, that is related

sberyozkin commented 3 months ago

@jmartisk

Guys I see you were discussing passing some extra metadata around a RAG pipeline, please check https://github.com/langchain4j/langchain4j/issues/1122, that is related

Thanks for the link, it is interesting, luckily, at least for the security identity, it can be injected into RAG related parts, but I'm not sure yet if it can be injected into Tools, something I can try in the next demo for the secure web socket based sql chatbot one, so that is indeed a very related discussion, thanks