run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: How to implement LLM caching #14995

Open Tejaswgupta opened 3 months ago

Tejaswgupta commented 3 months ago

Question

We're using a simple query engine to answer questions based on two documents. Since the chance of repeated questions is high, a caching mechanism like GPTCache (https://github.com/zilliztech/GPTCache) would be really helpful. However, GPTCache's documentation references an outdated llama-index API, and LlamaIndex's own documentation doesn't contain any reference to implementing caching.

Any help or pointers on implementing this would be great. Thank you!

dosubot[bot] commented 3 months ago

To implement caching for a simple query engine built with LlamaIndex, you can use LlamaIndex's own IngestionCache class rather than GPTCache. It supports several cache backends, including SimpleCache, RedisCache, MongoDBCache, and FirestoreCache. Here is an example of how to set up caching:

from llama_index.core.schema import BaseNode, TextNode
from llama_index.core.storage.kvstore import SimpleKVStore as SimpleCache
from llama_index.core.ingestion.cache import IngestionCache
from typing import List, Optional

# Initialize the cache
cache = IngestionCache(cache=SimpleCache())

# Function to put data into the cache
def put_data_into_cache(key: str, nodes: List[BaseNode]):
    cache.put(key, nodes)

# Function to get data from the cache
def get_data_from_cache(key: str) -> Optional[List[BaseNode]]:
    return cache.get(key)

# Example usage
key = "example_key"
nodes = [TextNode(text="example content")]  # Replace with your actual nodes (BaseNode is abstract and cannot be instantiated directly)

# Put data into the cache
put_data_into_cache(key, nodes)

# Get data from the cache
cached_nodes = get_data_from_cache(key)
if cached_nodes:
    print("Data retrieved from cache")
else:
    print("No data found in cache")

This example shows how to initialize the cache, put data into it, and retrieve it. You can replace SimpleCache with any other supported backend, such as RedisCache, MongoDBCache, or FirestoreCache, depending on your requirements. The IngestionCache class also provides methods for clearing and persisting the cache, which are useful for managing cached data over time [1][2].
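
As a rough sketch (not from the thread above), here is how an IngestionCache with a Redis backend can be attached to an IngestionPipeline so that repeated runs over the same documents reuse cached results, and how a SimpleCache-backed cache can be persisted, reloaded, and cleared. It assumes the llama-index-storage-kvstore-redis package is installed and a Redis server is reachable; the host, port, collection name, chunk size, and file path are placeholders:

from llama_index.core import Document
from llama_index.core.ingestion import IngestionCache, IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.storage.kvstore import SimpleKVStore as SimpleCache
from llama_index.storage.kvstore.redis import RedisKVStore as RedisCache

# Redis-backed cache: survives restarts and can be shared between processes
redis_cache = IngestionCache(
    cache=RedisCache.from_host_and_port(host="127.0.0.1", port=6379),
    collection="my_ingestion_cache",
)

# Attaching the cache to a pipeline means repeated runs over the same
# documents reuse cached transformation results instead of recomputing them
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(chunk_size=512)],
    cache=redis_cache,
)
nodes = pipeline.run(documents=[Document(text="example document text")])

# An in-memory SimpleCache can be persisted to disk, reloaded, and cleared
local_cache = IngestionCache(cache=SimpleCache())
local_cache.persist("./llama_cache.json")
restored_cache = IngestionCache.from_persist_path("./llama_cache.json")
restored_cache.clear()

Persisting in this way only applies to the in-memory SimpleCache; remote backends such as Redis are already durable on their own.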

To continue talking to Dosu, mention @dosu.