Closed: AI-ANK closed this 6 months ago
The issue you're encountering with `DocumentSummaryIndexLLMRetriever` and `DocumentSummaryIndexEmbeddingRetriever` focusing on only one city in `response.metadata` and `response.source_nodes`, despite a query involving multiple topics, is likely due to the default settings of the `choice_top_k` and `similarity_top_k` parameters. Both parameters default to `1`, which means each retriever selects only the single most relevant summary node from the index, based on the LLM's prediction or the embedding's similarity score, respectively.
To address this, you can set `choice_top_k` for `DocumentSummaryIndexLLMRetriever` and `similarity_top_k` for `DocumentSummaryIndexEmbeddingRetriever` to a higher value. This adjustment will allow the retrieval of document chunks or summaries related to both topics in your query, such as both cities in the example you provided (see the sketch below).
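As a minimal sketch, assuming the `doc_summary_index` built in the linked example (the value `2` is purely illustrative):

```python
from llama_index.core.indices.document_summary import (
    DocumentSummaryIndexEmbeddingRetriever,
    DocumentSummaryIndexLLMRetriever,
)

# LLM-based retriever: the LLM is asked to pick the top 2 summaries.
llm_retriever = DocumentSummaryIndexLLMRetriever(
    doc_summary_index,
    choice_top_k=2,  # default is 1, which explains the single-city results
)

# Embedding-based retriever: take the 2 most similar summaries.
embedding_retriever = DocumentSummaryIndexEmbeddingRetriever(
    doc_summary_index,
    similarity_top_k=2,  # default is 1
)

nodes = llm_retriever.retrieve("Tell me about the history of Chicago and Mumbai")
```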
For more detailed control, especially if your queries vary widely in scope or complexity, consider using the `VectorIndexAutoRetriever` class. This class allows for dynamic setting of vector store query parameters, including `similarity_top_k`, based on the output of an LLM, potentially offering a more nuanced approach to retrieving relevant document summaries or chunks for complex queries.
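For reference, a rough sketch of `VectorIndexAutoRetriever`; note that it operates over a `VectorStoreIndex` rather than a `DocumentSummaryIndex`, and the metadata schema below is an illustrative assumption:

```python
from llama_index.core.retrievers import VectorIndexAutoRetriever
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo

# Describe the index contents so the LLM can infer filters and top-k per query.
vector_store_info = VectorStoreInfo(
    content_info="Summaries of city documents",  # illustrative
    metadata_info=[
        MetadataInfo(name="city", type="str", description="City the document covers"),
    ],
)

auto_retriever = VectorIndexAutoRetriever(
    vector_index,  # a VectorStoreIndex built over your summaries (assumed)
    vector_store_info=vector_store_info,
    max_top_k=10,  # upper bound on the similarity_top_k the LLM may choose
)
```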
You can find more details and adjust these parameters in the source code of the retrievers, which should help in getting responses that cover all queried topics more effectively.
Sources
- llama-index-core/llama_index/core/indices/document_summary/retrievers.py
- llama-index-core/llama_index/core/indices/vector_store/retrievers/auto_retriever/auto_retriever.py
Thanks @dosubot. Can you tell me the difference between `DocumentSummaryIndexLLMRetriever` and `DocumentSummaryIndexEmbeddingRetriever`?
The primary difference between `DocumentSummaryIndexLLMRetriever` and `DocumentSummaryIndexEmbeddingRetriever` lies in their retrieval mechanisms and the technologies they leverage:
- `DocumentSummaryIndexLLMRetriever` utilizes Large Language Models (LLMs) to interpret queries and select the most relevant document summaries from an index. It sends queries and summary nodes to the LLM, which then identifies the most pertinent summaries based on its understanding of the query's context and content.
- `DocumentSummaryIndexEmbeddingRetriever`, on the other hand, employs vector embeddings for both queries and document summaries to perform similarity searches. It generates or uses pre-existing embeddings to find the top-k most similar document summaries to a given query, based on vector-space similarity measures.
The choice between these two depends on your specific needs:

- If accuracy and a deeper understanding of the query matter most, `DocumentSummaryIndexLLMRetriever` might be more suitable. However, it could be slower and more resource-intensive due to the need for real-time LLM inference.
- If speed and scalability matter most, `DocumentSummaryIndexEmbeddingRetriever` offers advantages. It leverages the speed of vector similarity searches, which can be highly optimized for performance.

Each retriever's effectiveness in a comprehensive document searching tool will depend on factors like query complexity, resource availability, and scalability requirements. Adjusting parameters such as `choice_top_k` for the LLM retriever and `similarity_top_k` for the embedding retriever can help in retrieving relevant summaries across multiple topics, as in your query about cities.
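As a rough sketch, either retriever can be dropped into a `RetrieverQueryEngine` (reusing the `llm_retriever` and `embedding_retriever` constructed above), so you can compare the two mechanisms without changing the rest of the pipeline:

```python
from llama_index.core.query_engine import RetrieverQueryEngine

# Works with either retriever; swap in embedding_retriever to compare behaviour.
query_engine = RetrieverQueryEngine.from_args(llm_retriever)

response = query_engine.query("Tell me about the history of Chicago and Mumbai")
print(response.source_nodes)  # should now include nodes for both cities
```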
@dosu @dosubot thanks. For my use case, I am trying to build a file search RAG system. Basically, I have a collection of documents over which I apply a RAG system. Users will ask questions like "I need a document that talks about Llama 2", and the RAG system should be able to fetch the top-k documents (not chunks) that are relevant to the query, along with their document summaries, followed by an LLM-generated justification of why each document was shortlisted for the given user query. I believe `DocumentSummaryIndexLLMRetriever` would be used in my case. @dosu @dosubot can you help with how I can implement this?
To build a file search system using a Retrieval-Augmented Generation (RAG) approach with `DocumentSummaryIndexLLMRetriever`, consider leveraging the `FusedSummaryRetriever` for enhanced retrieval. This approach combines document summaries with relevant fragments, improving context understanding and retrieval performance. Here's a streamlined guide:
1. Initialize Document Summary Index: Populate it with summaries of your documents to serve as the retrieval foundation.
2. Use FusedSummaryRetriever: This retriever enhances context by including both document summaries and relevant fragments. Initialize it with your vector store and document stores.
```python
from llama_index.packs.docugami_kg_rag.helpers.fused_summary_retriever import (
    FusedSummaryRetriever,
)

# Combines full-document summaries with matching fragments at query time.
fused_summary_retriever = FusedSummaryRetriever(
    vectorstore=my_vector_store,  # vector store holding fragment embeddings
    full_doc_summary_store=my_full_doc_summary_store,  # per-document summaries
    parent_doc_store=my_parent_doc_store,  # maps fragments to parent documents
)
```
3. Integrate with SelfRAGQueryEngine: Combine the `SelfRAGQueryEngine` with the `FusedSummaryRetriever` for dynamic retrieval based on query context.

```python
from llama_index.packs.self_rag.base import SelfRAGQueryEngine

# Wires the fused retriever into a self-reflective RAG query engine.
self_rag_query_engine = SelfRAGQueryEngine(
    model_path="path_to_your_model",  # path to the local Self-RAG model weights
    retriever=fused_summary_retriever,
)
```
4. Perform Searches: Process user queries to retrieve relevant documents and generate justifications with the `custom_query` method of the `SelfRAGQueryEngine`.
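A minimal usage sketch, assuming the engine constructed above and that `SelfRAGQueryEngine` follows the standard query engine interface (so `query` routes through its `custom_query` implementation):

```python
# Illustrative user query from the use case described above.
response = self_rag_query_engine.query("I need a document that talks about Llama 2")
print(response)
```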
5. Optimize for Relevance and Efficiency: Experiment with retrieval settings and consider fine-tuning the embedding model to improve retrieval quality. Adjust parameters like `temperature`, `top_p`, and `max_tokens` in the `SelfRAGQueryEngine` for optimal performance.
By following these steps and utilizing the `FusedSummaryRetriever` within the `SelfRAGQueryEngine`, you can create a sophisticated file search system that efficiently retrieves relevant documents and provides justifications for their selection. For detailed implementation, refer to the LlamaIndex documentation.
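Alternatively, if you want to stay with `DocumentSummaryIndexLLMRetriever` alone, a simpler sketch of the "top-n documents with summary plus justification" flow could look like the following; the justification prompt and the use of `Settings.llm` are assumptions for illustration, not a prescribed API:

```python
from llama_index.core import Settings
from llama_index.core.indices.document_summary import DocumentSummaryIndexLLMRetriever

query = "I need a document that talks about Llama 2"

# One summary node is returned per shortlisted document.
retriever = DocumentSummaryIndexLLMRetriever(doc_summary_index, choice_top_k=5)
summary_nodes = retriever.retrieve(query)

for node in summary_nodes:
    # Ask the configured LLM to justify the match (illustrative prompt).
    justification = Settings.llm.complete(
        f"Query: {query}\nDocument summary: {node.text}\n"
        "In one sentence, explain why this document matches the query."
    )
    print(node.metadata, node.text, justification, sep="\n\n")
```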
Question
I am following this code as is: https://docs.llamaindex.ai/en/stable/examples/index_structs/doc_summary/DocSummary/
When I ask questions whose answers span different documents, like "Tell me about the history of Chicago and Mumbai", the response itself seems alright. But when I inspect the nodes used to produce the response via `response.source_nodes` or `response.metadata`, all the chunks/documents are from only one city. Ideally, we should get back chunks from both documents. Am I missing something?
What I am trying to do is build a document repository searching tool. Basically, I have a large collection of documents; I want to create a summary of each one and save it in a vector index. A user will then query for some required information, and the tool's job is to search through all the documents and provide a list of the top 'n' matches along with a justification for each match. I thought `DocumentSummaryIndexLLMRetriever` and `DocumentSummaryIndexEmbeddingRetriever` would be the best approach, but this source_nodes/metadata issue is creating problems.
What I am trying to do is build a document repository searching tool. Basically, I have large collection of documents and I want to create summaries of each one and save it in vector index. Then user will query about some required information and the tool's job is to search through all the documents and provide a list of top 'n' matches along with the justification for the match. I thought DocumentSummaryIndexLLMRetriever and DocumentSummaryIndexEmbeddingRetriever would be the best approach but this source_nodes/metadata issue is creating problem.