The tests fail because the OPENAI_API_KEY secret is not passed into builds performed by pull requests. With the exception of GITHUB_TOKEN, secrets are not passed to the runner when a workflow is triggered from a forked repository.
Hi @edeandrea, thanks a lot! I like the idea of mocking/stubbing the interaction with the LLM, but I am not entirely sure that this client should be aware of this functionality. LangChain4j has lots of integrations with various LLM providers; I guess eventually users will want this feature for every model provider, and it would be cumbersome to implement the same logic for each and every existing and new integration.
I am wondering (just thinking out loud): since all models (e.g. OpenAiChatModel, OllamaChatModel, etc.) extend the same interface (ChatLanguageModel) and are interchangeable, we could create something like a MockedChatModel in LangChain4j which users could use instead of a real one, depending on the environment or their own custom logic.
We could even support a Mockito-like configuration style:
ChatLanguageModel model = MockedChatModel.builder()
    .when(userMessage -> userMessage.contains("hello")).thenReturn("Hello, how can I help you today?")
    .when(userMessage -> userMessage.contains("bye")).thenReturn("See you later, have a great day!")
    .otherwiseReturn("This is a default answer")
    .build();
This way it could be used instead of any model provider, not only OpenAI. A user could have MockedChatModel as a bean in their app configuration for a dev/test profile and a real OpenAiChatModel for a prod profile.
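For illustration, a Spring-style configuration could look roughly like this (MockedChatModel and its builder are hypothetical; only OpenAiChatModel and the Spring annotations are existing APIs):
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

@Configuration
class ChatModelConfig {

    // dev/test profiles: hypothetical mock model with canned answers
    @Bean
    @Profile({"dev", "test"})
    ChatLanguageModel mockedChatModel() {
        return MockedChatModel.builder()
                .when(userMessage -> userMessage.contains("hello")).thenReturn("Hello, how can I help you today?")
                .otherwiseReturn("This is a default answer")
                .build();
    }

    // prod profile: real OpenAI-backed model
    @Bean
    @Profile("prod")
    ChatLanguageModel openAiChatModel() {
        return OpenAiChatModel.builder()
                .apiKey(System.getenv("OPENAI_API_KEY"))
                .build();
    }
}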
Would love to hear your thoughts on this.
Hi there! Thanks for the note!
So, I did have the same thought as you with regards to
LangChain4j has lots of integrations with various LLM providers; I guess eventually users will want this feature for every model provider, and it would be cumbersome to implement the same logic for each and every existing and new integration.
But once I started thinking down that route, my thought was that this "functionality" or capability isn't a property of the model. It's really a property of the communication mechanism.
Because of that, I was also trying to shield LangChain4j from having to know anything about "what to do" in the situation; leave it up to the caller to handle it. I think it would be hard (if not impossible) for the library (or even extensions of the library, like Spring Boot or Quarkus) to predict all the potential use cases. These libraries create the models as beans and inject them into the application; at that point in time the libraries wouldn't know how the library is being used. Having a blanket "don't make the actual request and let the caller decide what to do in that moment" seemed like a better approach to me.
In Quarkus/Spring Boot, for example, the application would couple it with fault tolerance (fallbacks, etc.) that catches the exception and does something with it. Other consumers could use a try/catch or something else around the point at which they call it.
Now, for testing purposes, a fully implemented mocking framework might be really valuable. Currently we use WireMock to stub the interactions, but that unfortunately isn't a trick you can use when deploying an application into a "real" environment where perhaps you don't want to make real calls to a provider.
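For context, our WireMock stubbing in tests looks roughly like this (the endpoint path, port, and response body are illustrative):
import static com.github.tomakehurst.wiremock.client.WireMock.*;

import com.github.tomakehurst.wiremock.WireMockServer;

import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

// stub the chat completions endpoint so no real OpenAI call is made
WireMockServer wireMock = new WireMockServer(8089);
wireMock.start();
wireMock.stubFor(post(urlPathEqualTo("/v1/chat/completions"))
        .willReturn(okJson("{\"choices\":[{\"message\":{\"role\":\"assistant\",\"content\":\"stubbed answer\"}}]}")));

// point the model at the stub instead of api.openai.com
ChatLanguageModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:8089/v1")
        .apiKey("test-key")
        .build();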
Just my thought process.
I'll also add that I'm still new to this library, as well as LangChain4j, so I may not fully understand all the implementation details (yet - I'm getting there!). I've just learned that each model class actually contains its own client instance (thanks @geoand!).
So my previous comment
But once I started thinking down that route my thought is that this "functionality" or capability isn't a property of the model. It's really a property of the communication mechanism.
still somewhat holds true from a theoretical perspective, but because each model contains its own client, maybe this property does belong on the model itself.
Maybe adding a layer of inheritance that all of the models extend for things that might be common?
There are also things like EmbeddingModel, ImageModel, LanguageModel, and ModerationModel (as well as their streaming equivalents, where they exist).
But once I started thinking down that route my thought is that this "functionality" or capability isn't a property of the model. It's really a property of the communication mechanism.
It is not a property of the model (e.g. OpenAiChatModel); what I meant is that it is a completely new and separate type of model (a mock). I am not sure I understand why this is a property of the communication mechanism, could you please explain? (I am not saying it is wrong, I am genuinely trying to understand your point.)
Because of that, I was also trying to shield LangChain4j from having to know anything about "what to do" in the situation; leave it up to the caller to handle it. I think it would be hard (if not impossible) for the library (or even extensions of the library, like Spring Boot or Quarkus) to predict all the potential use cases. These libraries create the models as beans and inject them into the application; at that point in time the libraries wouldn't know how the library is being used. Having a blanket "don't make the actual request and let the caller decide what to do in that moment" seemed like a better approach to me.
LangChain4j does not need to know about this "disable live requests" functionality; there is a ChatLanguageModel interface in the langchain4j-core module which anyone can implement and plug into LangChain4j chains/AI services. One such implementation could be a ChatModelMock which we could provide out of the box, but this will not in any way change LangChain4j or make it "aware" of this functionality.
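To illustrate, a canned-response implementation could be as small as this (a rough sketch; the exact ChatLanguageModel method signatures may differ between versions):
import java.util.List;

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.output.Response;

// a mock model that always returns a fixed answer, pluggable anywhere a ChatLanguageModel is expected
public class ChatModelMock implements ChatLanguageModel {

    private final String cannedAnswer;

    public ChatModelMock(String cannedAnswer) {
        this.cannedAnswer = cannedAnswer;
    }

    @Override
    public Response<AiMessage> generate(List<ChatMessage> messages) {
        return Response.from(AiMessage.from(cannedAnswer));
    }
}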
In Quarkus/Spring Boot, for example, the application would couple it with fault tolerance (fallbacks, etc.) that catches the exception and does something with it. Other consumers could use a try/catch or something else around the point at which they call it.
Hm, is it common to use fault tolerance mechanisms for such cases?
Now, for testing purposes, a fully implemented mocking framework might be really valuable. Currently we use WireMock to stub the interactions, but that unfortunately isn't a trick you can use when deploying an application into a "real" environment where perhaps you don't want to make real calls to a provider.
Ah, so you want to use it not only during integration testing, but in "real" envs as well? I did not get this from the beginning, sorry for the confusion.
Now I am wondering: if one wants to be able to enable/disable live interaction via config properties, we could provide a proxy class like SwitchableChatModel that would wrap a delegate ChatLanguageModel instance and accept a boolean enabled and possibly a String defaultResponse property. This way you could enable/disable any model, not only OpenAI, and the OpenAI client stays agnostic of this. I am not sure about this though; the user would need to manage these properties themselves... just random thoughts.
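A rough sketch of that proxy (SwitchableChatModel is hypothetical; imports and method signatures follow the same assumptions as the sketch above):
// wraps any ChatLanguageModel and short-circuits calls when disabled
public class SwitchableChatModel implements ChatLanguageModel {

    private final ChatLanguageModel delegate;
    private final boolean enabled;
    private final String defaultResponse;

    public SwitchableChatModel(ChatLanguageModel delegate, boolean enabled, String defaultResponse) {
        this.delegate = delegate;
        this.enabled = enabled;
        this.defaultResponse = defaultResponse;
    }

    @Override
    public Response<AiMessage> generate(List<ChatMessage> messages) {
        if (!enabled) {
            return Response.from(AiMessage.from(defaultResponse)); // no live call when disabled
        }
        return delegate.generate(messages);
    }
}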
I appreciate the conversation!
It is not a property of the model (e.g. OpenAiChatModel); what I meant is that it is a completely new and separate type of model (a mock). I am not sure I understand why this is a property of the communication mechanism, could you please explain? (I am not saying it is wrong, I am genuinely trying to understand your point.)
I don't think it can be a completely separate type of model. It needs to be the same model that is already being used and/or extended by other downstream libraries/frameworks (like Quarkus or Spring).
This goes to the other comment about not just using this during testing; there may be times when this is used in "real" envs. I'm already doing this today in an application, and I've invented my own application-specific way of dealing with it. We are looking to enhance this application, though, and I'll have to repeat the same pattern. I feel like it makes sense for the behavior to be part of the library itself. We're also planning on extending its usage to other applications and will have to apply the same pattern.
Hm, is it common to use fault tolerance mechanisms for such cases?
I'm not sure how common it is for this particular use case, but for fault tolerance in general it is. The Quarkus extension recommends it (https://docs.quarkiverse.io/quarkus-langchain4j/dev/fault-tolerance.html).
With the approach I took (throwing an exception) in this case, fault tolerance (or try/catch in the caller) would be the way you would deal with it, if you wanted to use this functionality.
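For example, with MicroProfile Fault Tolerance in Quarkus a caller could fall back when the disabled client throws (the Assistant AI service and the fallback content here are illustrative, not part of any library):
import jakarta.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.faulttolerance.Fallback;

@ApplicationScoped
public class ChatService {

    private final Assistant assistant; // an AI service backed by the (possibly disabled) model

    public ChatService(Assistant assistant) {
        this.assistant = assistant;
    }

    // if the disabled client throws, serve a canned answer instead of failing the request
    @Fallback(fallbackMethod = "defaultAnswer")
    public String chat(String userMessage) {
        return assistant.chat(userMessage);
    }

    String defaultAnswer(String userMessage) {
        return "The assistant is currently unavailable.";
    }
}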
With an approach like
ChatLanguageModel model = MockedChatModel.builder()
    .when(userMessage -> userMessage.contains("hello")).thenReturn("Hello, how can I help you today?")
    .when(userMessage -> userMessage.contains("bye")).thenReturn("See you later, have a great day!")
    .otherwiseReturn("This is a default answer")
    .build();
I'm not sure how you would define that in externalized configuration (like a properties file or an environment variable) passed into an application. Something like that would require code modification if it were to be usable at application runtime (not just during testing).
Ah, so you want to use it not only during integration testing, but in "real" envs as well? I did not get this from the beginning, sorry for the confusion.
Yes, that's where I am going with this.
Now I am wondering: if one wants to be able to enable/disable live interaction via config properties, we could provide a proxy class like SwitchableChatModel that would wrap a delegate ChatLanguageModel instance and accept a boolean enabled and possibly a String defaultResponse property. This way you could enable/disable any model, not only OpenAI, and the OpenAI client stays agnostic of this. I am not sure about this though; the user would need to manage these properties themselves... just random thoughts.
Wouldn't downstream libraries then have to make changes to extend this model? Also, having a String defaultResponse property wouldn't really work for the other kinds of models (images, etc.). That's part of the reason I was trying to keep this library (and LangChain4j, or any other library which derives from it, now or in the future) from having to manage what the "rules" are (i.e. under this condition return that, or always return this response). Leave that to the application that is implementing the capability, as it is specific to their needs, and potentially differs on a deployment-by-deployment basis across environments.
I have a slightly different idea which I would like to share.
Essentially, what I see coming up a lot from our side (i.e. a framework consuming the library and trying to provide additional functionality on top of it) is that if one needs to configure something on any LLM client (be it OpenAI, Azure, or anything else), then the necessary properties need to be added to the builder of a model (like the ChatLanguageModel).
So what if we did something like:
public abstract static class Builder<T extends OpenAiClient, B extends Builder<T, B>> { // this is the builder inside OpenAiClient

    // add this field
    public Function<OpenAiClient.Builder<T, B>, OpenAiClient.Builder<T, B>> customizer; // generics are terrible here, but it's only meant for advanced integrations anyway

    // add these methods
    public B customizer(Function<OpenAiClient.Builder<T, B>, OpenAiClient.Builder<T, B>> customizer) {
        this.customizer = customizer;
        return (B) this;
    }

    protected B getEffectiveBuilder() {
        OpenAiClient.Builder<T, B> effectiveBuilder = this;
        if (customizer != null) {
            effectiveBuilder = customizer.apply(effectiveBuilder);
        }
        return (B) effectiveBuilder;
    }
}
and then the implementations of the builder (in DefaultOpenAiClient and the Quarkus-specific one) would simply do:
public DefaultOpenAiClient build() {
    return new DefaultOpenAiClient(getEffectiveBuilder());
}
The plus side here is that this allows advanced integrations to do whatever they want with the builder without affecting normal integrations at all, meaning that what @edeandrea is trying to do could easily be accomplished on the Quarkus side by providing the proper customizer.
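As a usage sketch (the builder() factory, apiKey, and baseUrl are assumed existing pieces of the openai4j builder; customizer is the new method from the snippet above):
// e.g. the Quarkus extension could redirect the client without openai4j knowing why
OpenAiClient client = DefaultOpenAiClient.builder()
        .apiKey("demo")
        .customizer(b -> b.baseUrl("http://localhost:8089/v1")) // advanced integration tweaks the builder
        .build();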
What do you think?
I like the idea @geoand. Maybe a slightly different approach would be to allow a consumer to register a callback (like a Consumer<OpenAiClient.Builder<T, B>>) which would be called in the builder's build method (if set). It would eliminate the need to add a getEffectiveBuilder method.
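Roughly (just a sketch, with the field and types adapted from the snippet above):
// Consumer-based variant: the callback mutates the builder in place
public Consumer<OpenAiClient.Builder<T, B>> customizer;

public B customizer(Consumer<OpenAiClient.Builder<T, B>> customizer) {
    this.customizer = customizer;
    return (B) this;
}

// ...and in DefaultOpenAiClient:
public DefaultOpenAiClient build() {
    if (customizer != null) {
        customizer.accept(this); // callback invoked just before constructing the client
    }
    return new DefaultOpenAiClient(this);
}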
The problem I'm starting to see now, though, after the several conversations in this thread is that doing this here in openai4j only adds this capability to things that use this OpenAI client.
What about the other kinds of models (EmbeddingModel, ImageModel, LanguageModel, and ModerationModel) that are in LangChain4j?
If it would be helpful, I could try out a small proof of concept in LangChain4j.
(like a Consumer<OpenAiClient.Builder<T, B>>) which would be called in the builder's build method (if set). It would eliminate the need to add a getEffectiveBuilder method.
The reason I proposed the function is that, theoretically, an advanced integration might want to change the builder entirely.
The problem I'm starting to see now, though, after the several conversations in this thread is that doing this here in openai4j only adds this capability to things that use this OpenAI client
That's a serious shortcoming indeed.
Maybe we could have each model have a way to customize the underlying client (in a general way, like I proposed above).
The reason I proposed the function is that, theoretically, an advanced integration might want to change the builder entirely.
That makes sense! Return a completely different builder so long as it is still an extension of it. Maybe even create a base level builder interface?
That's a serious shortcoming indeed.
Maybe we could have each model have a way to customize the underlying client (in a general way, like I proposed above).
That's what I'm thinking too, the more I think about this.
builder so long as it is still an extension of it
Mocking should be easy with this
Should I whip up a small prototype within LangChain4j to prove this out? Or do you want to run with it?
Go ahead 😃
How did I know that was going to be your answer :)
Because my TZ is +7 of yours 😄
So are you saying you're calling it quits for the day? Slacker.... :D
@geoand / @langchain4j I put together a prototype here: https://github.com/langchain4j/langchain4j/pull/531
I'm going to close this in favor of what's going on in https://github.com/langchain4j/langchain4j/pull/531
Since integrating with OpenAI costs real money, this introduces a way for me to "turn off" the integration. Maybe in some environment, instead of making a "real" call, I serve a default response.
This capability will certainly be extended further downstream (see https://github.com/quarkiverse/quarkus-langchain4j/issues/130), but the capability needs to be in the builder methods here, so this at least needs to be here so that it can be extended.
The design I took here is to try to keep the client from having to know anything about what to do in the case where it is turned off. The case is treated simply like any client exception, which bubbles up as an OpenAiHttpException with a message indicating that the request was disabled. I also added a bunch of test cases/classes which mirror the existing tests but enforce the fact that the client is disabled. I may have gone overboard with the tests, but I thought it better to be safe.
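To sketch what a caller sees (the disableRequests flag name, the builder() entry point, and the request shape are illustrative, not necessarily the final API):
// hypothetical: build a client with live requests turned off
OpenAiClient client = DefaultOpenAiClient.builder()
        .apiKey("demo")
        .disableRequests(true) // illustrative flag name
        .build();

try {
    client.chatCompletion(chatRequest).execute(); // chatRequest built as usual; no HTTP call is made
} catch (OpenAiHttpException e) {
    // bubbles up with a message indicating the request was disabled;
    // the caller (or a fault-tolerance fallback) decides what to serve instead
}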
I'm happy to discuss and make any changes or alter naming conventions.