Closed: brycecf closed this 3 months ago
For scaling a web service using a SummaryIndex with OpenSearch, consider these strategies, which leverage existing databases or document stores more efficiently. In the LlamaIndex repository, summary-based retrieval is scaled through:

- `DocumentSummaryIndexLLMRetriever`, which processes summaries in batches, optimizing LLM calls.
- `DocumentSummaryIndexEmbeddingRetriever`, which uses vector embeddings for efficient similarity search, scaling with large datasets.

These strategies address the challenges of scaling and index reconstruction at large scale.
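To make the embedding-retriever idea concrete, here is a minimal, self-contained sketch of what `DocumentSummaryIndexEmbeddingRetriever` does conceptually: embed each per-document summary, embed the query, and return the best-matching documents. The `embed`, `cosine`, and `retrieve_by_summary` names are hypothetical stand-ins, not the actual LlamaIndex API, and the word-set embedding is a toy substitute for a real embedding model.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: the set of lowercase words.
    return set(text.lower().split())

def cosine(a, b):
    # Cosine similarity of the word-set indicator vectors.
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def retrieve_by_summary(query, summaries, top_k=2):
    """Rank per-document summaries by similarity to the query and
    return the ids of the top_k documents, as an embedding-based
    summary retriever does conceptually."""
    q = embed(query)
    scored = sorted(
        summaries.items(),
        key=lambda item: cosine(q, embed(item[1])),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:top_k]]

summaries = {
    "doc-a": "Release notes for the OpenSearch vector store integration",
    "doc-b": "Recipes for baking sourdough bread at home",
    "doc-c": "Scaling web services that query a shared database",
}
print(retrieve_by_summary("scaling a web service with OpenSearch", summaries))
# → ['doc-c', 'doc-a']
```

The key point for scale is that this ranking runs against stored embeddings (e.g. in OpenSearch), so nothing has to be rebuilt per request.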
Question
Does the LlamaIndex team have recommendations on how to use a SummaryIndex, at scale, in a web service?
I currently use OpenSearch as my vector store, and the usage pattern essentially mirrors that of any other web service querying a database: if the web service needs to scale up, it does so, and the database remains accessible (aside from the increased load).
Unlike with a VectorStoreIndex, that pattern no longer holds with something like the SummaryIndex. At least based on the documentation examples, I only see SummaryIndexes constructed locally. At a small document scale you could potentially afford to reconstruct the SummaryIndex each time, but that would quickly become an issue (and is needlessly redundant).
My idea is to use a bucket store or caching service, but I want to confirm there is not already a better way that utilizes existing databases/docstores.