Prince-Mendiratta opened 7 months ago
Hello again, @Prince-Mendiratta! Great to see you diving deep into langchainjs. Hope you're enjoying the experience as much as we enjoy seeing you here. Let's look into this issue.
Based on the information available, it seems that the `ChatOpenAI` class in the langchainjs codebase does support caching. However, the provided context does not specify whether caching is supported when used with the 'gpt-4-1106-preview' model specifically. The tests provided in the context show the usage of caching with the 'gpt-3.5-turbo' model. Here is the supporting evidence from the context:
```typescript
test("ChatOpenAI can cache generations", async () => {
  const memoryCache = new InMemoryCache();
  const lookupSpy = jest.spyOn(memoryCache, "lookup");
  const updateSpy = jest.spyOn(memoryCache, "update");
  const chat = new ChatOpenAI({
    modelName: "gpt-3.5-turbo",
    maxTokens: 10,
    n: 2,
    cache: memoryCache,
  });
  const message = new HumanMessage("Hello");
  const res = await chat.generate([[message], [message]]);
  expect(res.generations.length).toBe(2);
  expect(lookupSpy).toHaveBeenCalledTimes(2);
  expect(updateSpy).toHaveBeenCalledTimes(2);
  lookupSpy.mockRestore();
  updateSpy.mockRestore();
});
```
In this test, an instance of `ChatOpenAI` is created with a `memoryCache` passed as the `cache` option. The `generate` method is then called twice with the same input, and the test checks that the cache's `lookup` and `update` methods are called the expected number of times. This indicates that the `ChatOpenAI` class does support caching. However, without specific information about the 'gpt-4-1106-preview' model, it's not possible to confirm whether caching is supported for that model. You can find this test in the `langchain/src/chat_models/tests/chatopenai.int.test.ts` file.
Regarding your question about `response_format` set to `json_object`, I wasn't able to find specific information about this within the repository. It's possible that the caching mechanism behaves differently when `response_format` is set to `json_object`, but without further information or context, it's hard to say for sure.
If you could provide more details about the issue you're experiencing, such as any error messages or unexpected behavior, it would be helpful in diagnosing the problem. Also, if you could share more about your use case or the specific requirements for using the 'gpt-4-1106-preview' model with `response_format` set to `json_object`, it might help in finding a solution or workaround.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a π if this is helpful and π if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
What do you mean "doesn't work"? It throws an error or it doesn't cache the response?
Ah, sorry for not being descriptive enough. By saying it doesn't work, I meant that the response isn't cached. With the same prompt, a new request is always sent to the LLM and no caching takes place. No error is thrown.
Got it - I think I know what's causing it. Will try to dig in today.
CC @Prince-Mendiratta I looked into this in #3754 and wasn't able to repro with the in memory cache - can you try the following?
```typescript
import { HumanMessage } from "@langchain/core/messages";
import { InMemoryCache } from "@langchain/core/caches";
import { ChatOpenAI } from "@langchain/openai";

const memoryCache = new InMemoryCache();
const chat = new ChatOpenAI({
  modelName: "gpt-3.5-turbo-1106",
  temperature: 1,
  cache: memoryCache,
}).bind({
  response_format: {
    type: "json_object",
  },
});
const message = new HumanMessage(
  "Respond with a JSON object containing arbitrary fields."
);
const res = await chat.invoke([message]);
console.log(res);
const res2 = await chat.invoke([message]);
console.log(res2);
```
@jacoblee93 Thanks for looking into this! The example you shared does work; I think the issue lies in using LLMChain. Please try this example, where I was able to reproduce the issue. I'm using the date log to determine whether the response was cached:
```typescript
import { InMemoryCache } from 'langchain/cache';
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { LLMChain } from 'langchain/chains';
import { PromptTemplate } from 'langchain/prompts';

(async () => {
  const memoryCache = new InMemoryCache();
  const chat = new ChatOpenAI({
    modelName: 'gpt-3.5-turbo-1106',
    temperature: 1,
    cache: memoryCache,
  }).bind({
    response_format: {
      type: 'json_object',
    },
  });
  const prompt_template = new PromptTemplate({
    template: 'Respond with a JSON object containing arbitrary {app}.',
    inputVariables: ['app'],
  });
  const chain = new LLMChain({
    llm: chat,
    prompt: prompt_template,
    outputKey: 'res',
    verbose: true,
  });
  console.log(new Date());
  await chain.call({ app: 'fields' });
  console.log(new Date());
  await chain.call({ app: 'fields' });
  console.log(new Date());
})();
```
Ah. We're moving towards deprecating LLMChain in favor of a prompt -> LLM -> output parser runnable:
https://js.langchain.com/docs/expression_language/cookbook/prompt_llm_parser
Would you be up for switching over to that?
I'm wondering if I'm hitting the same issue here, or if it's different?
```typescript
const promptTemplate = PromptTemplate.fromTemplate(botPrompt);
const outputParser = new JsonOutputFunctionsParser();
const model = new ChatOpenAI({
  openAIApiKey: service_token,
  modelName: 'gpt-4-turbo-preview',
  verbose: true,
  cache: llm_cache,
  timeout: 6000,
});
const chain = RunnableSequence.from([
  promptTemplate,
  model.bind({ functions: [functionSchema], function_call: { name: 'extractor' } }),
  outputParser,
]);
const outcome = (await chain.invoke({ message }, { metadata: metaData })) as AIResponse;
```
In the above, I'm using a `JsonOutputFunctionsParser` along with the following `functionSchema`:
```typescript
const functionSchema = {
  name: 'extractor',
  description: 'Extracts the relevant moderation decision based on the input',
  parameters: {
    type: 'object',
    properties: {
      decision: {
        type: 'string',
        enum: ['D', 'K', 'R'],
        description: 'The overall decision made',
      },
    },
    // `required` belongs inside `parameters` for a valid JSON Schema;
    // in the original snippet it sat at the top level of functionSchema.
    required: ['decision'],
  },
};
```
In Redis I can see the key being created with the following content:
However, even though the result is cached, tokens are still being used, I assume to execute the extractor function. I would have assumed this whole process would result in a cache hit and simply return the stored content. For additional context, here are the LangSmith traces for the exact same runs. There is definitely some caching, but I'd have expected a near-instant response and zero token usage.
Hi, @Prince-Mendiratta,
I'm helping the langchainjs team manage their backlog and am marking this issue as stale. From what I understand, you reported an issue with the Redis cache not working with JSON mode on the `gpt-4-1106-preview` model. There was a discussion with jacoblee93 about potential causes, code examples were shared, and there was an agreement to deprecate LLMChain in favor of a prompt -> LLM -> output parser runnable. ImTheDeveloper also shared a similar issue with using a `JsonOutputFunctionsParser` and provided additional context and screenshots.
Could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let the langchainjs team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
This is still an issue!
Hi! I've noticed that the cache does not work with JSON mode.
Cache works well here:
This does not: