Closed sberyozkin closed 3 months ago
Nice!
I am guessing that the error you are seeing is due to maven not being configured with to make javac
retain method parameter names.
Thanks @geoand, yes, I was wondering how exactly does the translate demo work which has 2 parameters, and now I've found https://github.com/quarkiverse/quarkus-langchain4j/blob/main/samples/cli-translator/pom.xml#L12 :-)
@geoand @jmartisk I'm getting closer to finalizing the demo flow, a user accesses the login page where a Google authentication option is offered, user logs in and is redirected to the fraud detection entry page where the user can choose to check it by amount or distance. This in itself is useful enough to show how to use the authentication token, but I think the most interesting part is actually how to utilize the fact the OIDC session is available for correlating multiple fraud queries to get more interesting responses each time the same authenticated user tries to check the fraud status, so I'll play around next with the custom memory id idea. Once we are all happy with the demo flow, I'll fix README
@geoand I prototyped a custom DefaultMemoryIdProvider
and in this demo I'd like to do something simple, since no WebSockets are involved, like adding a boolean property indicating if the customer has already issued a given query before (not 100% sure how to do it yet :-), right now I'm just hoping the existing chat history can be correctly detected), and then later in the secure variation of the csv chat box demo, use the security identity info in the memory id to impact the SQL query.
But the following error is becoming quite persistent:
Resulted in: java.lang.RuntimeException: dev.ai4j.openai4j.OpenAiHttpException: {
"error": {
"message": "Invalid parameter: messages with role 'tool' must be a response to a preceeding message with 'tool_calls'.",
"type": "invalid_request_error",
"param": "messages.[1].role",
"code": null
}
}
at dev.langchain4j.internal.RetryUtils$RetryPolicy.withRetry(RetryUtils.java:195)
at dev.langchain4j.internal.RetryUtils.withRetry(RetryUtils.java:229)
at dev.langchain4j.model.openai.OpenAiChatModel.generate(OpenAiChatModel.java:153)
at dev.langchain4j.model.openai.OpenAiChatModel.generate(OpenAiChatModel.java:118)
at dev.langchain4j.model.chat.ChatLanguageModel_XNMsOaekknG7BdNZ5YSUkjh1SqE_Synthetic_ClientProxy.generate(Unknown Source)
at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.doImplement(AiServiceMethodImplementationSupport.java:228)
at io.quarkiverse.langchain4j.runtime.aiservice.AiServiceMethodImplementationSupport.implement(AiServiceMethodImplementationSupport.java:97)
at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl.detectAmountFraudForCustomer(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass.detectAmountFraudForCustomer$$superforward(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass$$function$$1.apply(Unknown Source)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:73)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.proceed(AroundInvokeInvocationContext.java:62)
at io.smallrye.faulttolerance.FaultToleranceInterceptor.lambda$syncFlow$3(FaultToleranceInterceptor.java:253)
at io.smallrye.faulttolerance.core.InvocationContext.call(InvocationContext.java:20)
at io.smallrye.faulttolerance.core.Invocation.apply(Invocation.java:29)
at io.smallrye.faulttolerance.core.timeout.Timeout.doApply(Timeout.java:55)
at io.smallrye.faulttolerance.core.timeout.Timeout.apply(Timeout.java:30)
at io.smallrye.faulttolerance.FaultToleranceInterceptor.syncFlow(FaultToleranceInterceptor.java:255)
at io.smallrye.faulttolerance.FaultToleranceInterceptor.intercept(FaultToleranceInterceptor.java:182)
at io.smallrye.faulttolerance.FaultToleranceInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_Subclass.detectAmountFraudForCustomer(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionAi$$QuarkusImpl_ClientProxy.detectAmountFraudForCustomer(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionResource.detectBaseOnAmount(FraudDetectionResource.java:39)
at io.quarkiverse.langchain4j.sample.FraudDetectionResource_Subclass.detectBaseOnAmount$$superforward(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionResource_Subclass$$function$$2.apply(Unknown Source)
Can you please, when you have a few mins, have a quick look at FraudDetectionAi
and both TransactionRepository
and CustomerRepository
, does something look wrong ? I've seen this error on the web, but I'm not sure yet where to start digging, thanks
It is transient though, after some delay, I've restarted and I'm getting an interesting response, There are no recent transactions for the customer John Doe with the email john.doe@example.com.
, and then later I got a normal response.
Probably related to the OpenAi session left hanging after some restarts, so may be some extra close
is needed.
In any case, I'm moving away from using tools in favor of ContentRetriever, yet to be implemented similarly to how it is done in the csv chatbot demo, since it gives a nice option to use Query.getMetadata().getMemoryId
, but I'm keeping it as a separate commit in case I get stuck :-)
Update: I've followed with a few minor updates and squashed everything, it will be easy to get back to tools if necessary
Gonna look into this one in a bit!
@geoand @jmartisk I think this is ready for review now, have a look please, if possible, test with Google.
Later, I'll follow up with hopefully a more advanced variation involving web sockets, once Quarkus 3.11 or later is used.
The custom ContentRetriever
has a request context activated to be able to read from Panache repositories.
I was hoping that may be I can add @Authenticated
there as well and get SecurityIdentity
injected, instead of using a custom memory id to capture a minimum session state in the form of the authenticated user's name and email, like
@ActivateRequestContext
@Authenticated
List<Content> retrieve(Query query) {}
But it fails with
jakarta.enterprise.context.ContextNotActiveException: RequestScoped context was not active when trying to obtain a bean instance for a client proxy of CLASS bean [class=io.quarkus.vertx.http.runtime.CurrentVertxRequest, id=0_6n6EmChCiiDdd8HelptG_A0AE]
- you can activate the request context for a specific method using the @ActivateRequestContext interceptor binding
at io.quarkus.arc.impl.ClientProxies.notActive(ClientProxies.java:70)
at io.quarkus.arc.impl.ClientProxies.getSingleContextDelegate(ClientProxies.java:30)
at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.arc$delegate(Unknown Source)
at io.quarkus.vertx.http.runtime.CurrentVertxRequest_ClientProxy.getOtherHttpContextObject(Unknown Source)
at io.quarkus.resteasy.reactive.server.runtime.QuarkusCurrentRequest.get(QuarkusCurrentRequest.java:21)
at org.jboss.resteasy.reactive.server.core.CurrentRequestManager.get(CurrentRequestManager.java:8)
at io.quarkus.resteasy.reactive.server.runtime.StandardSecurityCheckInterceptor.intercept(StandardSecurityCheckInterceptor.java:39)
at io.quarkus.resteasy.reactive.server.runtime.StandardSecurityCheckInterceptor_AuthenticatedInterceptor_Bean.intercept(Unknown Source)
at io.quarkus.arc.impl.InterceptorInvocation.invoke(InterceptorInvocation.java:42)
at io.quarkus.arc.impl.AroundInvokeInvocationContext.perform(AroundInvokeInvocationContext.java:30)
at io.quarkus.arc.impl.InvocationContexts.performAroundInvoke(InvocationContexts.java:27)
at io.quarkiverse.langchain4j.sample.FraudDetectionContentRetriever_Subclass.retrieve(Unknown Source)
at io.quarkiverse.langchain4j.sample.FraudDetectionContentRetriever_ClientProxy.retrieve(Unknown Source)
@michalvavrik, does it ring any bell to you ?
It is strange because @ActivateRequestContext
is there, it looks like it is lost during the context propagation...
So I guess for now, using a custom memory id is OK, in fact, it can make sense to combine eventually both @Authenticated
and the custom memory id to compare the injected identity with the info from this memory id to assert within the retriever that the AI conversation is essentially bound to the current identity to avoid any chances of the retriever being called outside of the current security context. So I'd say the above exception is not a big problem for now as @Authenticated
is already protecting the entry Login
and FraudDetectionResource
resources.
It is strange because @ActivateRequestContext is there, it looks like it is lost during the context propagation...
StandardSecurityCheckInterceptor
has priority PLATFORM_BEFORE
while ActivateRequestContextInterceptor has PLATFORM_BEFORE + 100
. Quarkus Security extension standard security interceptors have LIBRARY_BEFORE
. As long as you fix this bug with anything lesser than LIBRARY_BEFORE
, it will be fine. Sorry, I caused this (quite a long time ago now). You want to skip Quarkus Security interceptors so you need to run the interceptors in Quarkus REST before them (users complained security checks are repeated).
So I guess for now, using a custom memory id is OK, in fact, it can make sense to combine eventually both @Authenticated and the custom memory id to compare the injected identity with the info from this memory id to assert within the retriever that the AI conversation is essentially bound to the current identity to avoid any chances of the retriever being called outside of the current security context. So I'd say the above exception is not a big problem for now as @Authenticated is already protecting the entry Login and FraudDetectionResource resources.
I know practically nothing about this project and code, but let me note that you will not be able to use standard security annotations on the FraudDetectionContentRetriever
or down the stream because the exception you are getting means that CDI request context has not been activated by Quarkus REST (e.g. at that point, it's already deactivated for some reason and ActivateRequestContext
will activate brand new context). Which means no SecurityIdentity
unless I am missing something custom you do there.
Thanks @michalvavrik for the feedback, getting to checking comments only now... OK, I'll open a Quarkus issue to consider what can be done with the priorities. I've only tried it to consider optimizing the way the custom identity representation is passed through the various Quarkus LangChain4j beans, for now using a custom memory id only is fine IMHO.
Ok I can get it working now. I'm not super happy about users having to manually update the Setup
class to add a user that exactly matches their Google account. If one logs in as some other user, I get a NPE in CustomerRepository.getTransactionLimit
because the user doesn't exist in the database (this error should definitely be handled better, it should show an error in the UI or something, or at least re-throw the exception with a meaningful message).
I'm thinking how to make it easier.. Could we maybe allow the user to specify an email address (and name) that should be registered during the Setup, as a config property? (If we decide to keep the current approach, I say it should at least be much more obvious from the readme that one has to update the class)
Thanks @jmartisk, makes sense to make it easier for users to register emails, may be with a system property, will have a look next week
Thanks @geoand,
Would it not be far easier to inject the logged in user in FraudDetectionRetrievalAugmentor?
Yeah, good point, I tried it earlier, to inject it directly to FraudDetectionContentRetriever
but RequestContext carrying the security identity was not available.
I can try to inject it into a custom retrieval augmentor, but if that works, how would I pass it to the content retriever ?
Yeah, good point, I tried it earlier, to inject it directly to FraudDetectionContentRetriever but RequestContext carrying the security identity was not available.
You are right, I see why this is the case...
Unfortunately I don't have any good solutions for you at the moment, but I do see how we add something to LangChain4j to make this possible.
Essentially what I have in mind is to enhance dev.langchain4j.rag.query.Metadata
with a Map<String, Object>
where custom data can be added.
Then we would also need a new concept like MetadataEnhancer
which we could then use in Quarkus to put in whatever we want, in this case the user id or email or whatever.
Then your ContentRetriever
could retrieve the information from query.getMetadata()
;
What I don't like about this is that it would force user code (the ContentRetriever
to know specific keys).
An alternative approach would be for us to write our own version of DefaultRetrievalAugmentor
which would not suffer from the problem of not passing the context (as it take steps to have that done properly).
WDYT @jmartisk @sberyozkin ?
Essentially what I have in mind is to enhance dev.langchain4j.rag.query.Metadata with a Map<String, Object> where custom data can be added.
Sure, that can definitely do it, some well known key properties like io.quarkus.security.SecurityIdentity
can be doc-ed
An alternative approach would be for us to write our own version of DefaultRetrievalAugmentor which would not suffer from the problem of not passing the context (as it take steps to have that done properly).
I'm not sure yet as I don't know much about the mechanics involved, on the basic level having an extra Metadata Map looks like it can work better as the identity may be needed across several pieces where Query is available, not only at the RAG level.
I was actually thinking about some workarounds on the way to the office :-) I was still thinking in terms of a custom memory id since the authenticated/logged in user's session lifetime matches the secure memory id lifetime.
So what I had in mind is that the project introduces CompositeMemoryIdBean
bean and now, the core code, before converting a provided memory id object to String, checks if this object has a marker like CompositeMemoryBeanExpected
(or may be simply if it an instance of String or not - if not - wrapping is expected). And if the composite memory id bean is expected, then CompositeMemoryIdBean will wrap and carry the original custom memory id object around, its toString()
method will be done the way it is done now in the core (memory id object string rep plus an AI service class/method name), and then whichever provider needs it, it casts a memory id object to CompositeMemoryIdBean
after an instanceof check, and gets an original memoryId object as getMemoryId
May be it is a bit hacky, not sure, though it would probably fit closely enough the expectation that the memory id can be any object. I think I also like the idea of the extra metadata map.
I was actually thinking about some workarounds on the way to the office :-) I was still thinking in terms of a custom memory id since the authenticated/logged in user's session lifetime matches the secure memory id lifetime.
This is the one solution I would like to avoid because MemoryId is really a different concept. It's fine to populate it automatically, but it's not fine to use it's value to do something else.
@geoand I'm happy enough to pursue adding extra Map to Metadata
, I can probably suggest a PR to langchain4j, it looks it can be of general interest... How would we set the extra bits into this Map though at the quarkus-langchain4 level ?
it looks it can be of general interest.
Yeah, I think so to, but there also needs to be a way to add metadata, something like the MetadataEnhanced
I suggested above.
@geoand OK, then, it will certainly cover the RAG flow. Can you clarify please, when you get a sec, why RequestContext is not available in the custom augmentor or content retriever, like you said, that would definitely be a simplest solution, if it were available. (I'm also thinking about making the identity visible in other places like custom tools and other custom non-RAG related providers). I'm pretty sure your idea of adding an extra map to Metadata can be generally useful though, starting from it would be great, so I'll look into opening a PR in langchain4j
Can you clarify please, when you get a sec, why RequestContext is not available in the custom augmentor or content retrieve
Because the implementation in upstream LangChain4j uses a CompletableFuture
and it's own executor in order to do calls in parallel.
Now that I think about it, it probably makes sense to see updating DefaultRetrievalAugmentor
to use our own Executor which we would feed with the one from context propagation works. Do you want to give that a shot?
Now that I think about it, it probably makes sense to see updating DefaultRetrievalAugmentor to use our own Executor which we would feed with the one from context propagation works. Do you want to give that a shot?
Actually, it should be possible to do this even now, no changes to LangChain4j needed.
Hi @geoand, I was just dealing with formatting as well as a suggestion from @jmartisk to report something to the user when a customer is missing, and now a customer full name and email must be provided at startup as system properties, as opposed to having to modify Setup.java
. Hopefully the user exp will be better. Jan, please have a look when you are back.
Super cool that we can try our own executor, I can test, how do I do it ? Is Quarkus executor injectable or which one do you have in mind ?
You can inject org.eclipse.microprofile.context.ManagedExecutor
from context propagation and use that
@geoand Now I'm really happy, the custom memory id provider has gone :-), secure authenticated access to the content retriever is done, thanks for the ManagedExecutor idea. I think it can probably make sense to doc it separately, I can work a bit later on a doc dedicated to securing various AI service parts...
I think this PR is now quite ready to go once yourself and @jmartisk happy enough with it in general
Very nice!
+1 on getting this in - but let's wait for @jmartisk
Guys I see you were discussing passing some extra metadata around a RAG pipeline, please check https://github.com/langchain4j/langchain4j/issues/1122, that is related
@jmartisk
Guys I see you were discussing passing some extra metadata around a RAG pipeline, please check https://github.com/langchain4j/langchain4j/issues/1122, that is related
Thanks for the link, it is interesting, luckily, at least for the security identity, it can be injected into RAG related parts, but I'm not sure yet if it can be injected into Tools, something I can try in the next demo for the secure web socket based sql chatbot one, so that is indeed a very related discussion, thanks
Draft PR for a secure-fraud-detection demo.
The startup issue described here earlier was resolved as advised by Georgios