Closed. andreibondarev closed this issue 6 days ago.
Shouldn't the cache layer be as simple as hashing the inputs to the API calls and caching the results under that key? GPTCache looks like it's solving for a semantic cache, which wouldn't be generally useful for all use cases (e.g. text extraction workflows where queries might be very semantically similar across runs).
It looks like GPTCache has that option, but it would be much simpler to implement exact hits only first.
@weilandia Yes, I think that would be a good order to build this out in: 1) exact inputs (hashed) as the key, then 2) a "fuzzier" cache based on semantic similarity.
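Step 1 could be sketched roughly like this. This is just an illustration, not Langchain.rb's actual API: the class and method names are hypothetical, and a real implementation would likely swap the in-memory hash for Redis or a file-backed store.

```ruby
require "digest"
require "json"

# Hypothetical sketch of an exact-match LLM cache (step 1).
class ExactLLMCache
  def initialize
    @store = {} # in-memory for illustration; use Redis/disk in practice
  end

  # Hash the full request payload (prompt, model, temperature, ...)
  # so any parameter change produces a different key. Sorting the
  # params first makes the key insensitive to argument order.
  def key_for(params)
    Digest::SHA256.hexdigest(JSON.generate(params.sort.to_h))
  end

  # Return the cached response for these params, or run the block
  # (the real LLM call) and memoize its result.
  def fetch(params)
    key = key_for(params)
    @store.fetch(key) { @store[key] = yield }
  end
end
```

Usage would look like `cache.fetch(prompt: "Hi", model: "gpt-4") { llm.complete(...) }`, where the block only fires on a cache miss.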
Closing this for now, as it seems the LLM providers themselves will support this out of the box.
We need a mechanism to cache identical and similar requests to the LLM. The Python de facto solution is https://github.com/zilliztech/GPTCache.