Closed ipoletaev closed 4 months ago
It seems you're exploring how to integrate a PandasQueryEngine for structured search with a VectorStoreIndex for semantic search into a composite retriever, aiming to leverage both in a unified retrieval process. This approach is indeed innovative and aligns with advanced search and retrieval paradigms.
In the context of the LlamaIndex repository, while there isn't a direct example that matches your exact use case, the repository does contain examples of how different search methods can be combined. For instance, the SummaryIndex class demonstrates the aggregation of different node types (TextNode and IndexNode) for retrieval through a single instance. This is akin to creating a composite retriever, as seen in the test_composable_retrieval function within the test_composable_retriever.py file.
Moreover, the VectorIndexAutoRetriever class, detailed in auto_retriever.py, showcases an approach to automatically set vector store query parameters using a Language Model (LLM). This class essentially acts as a composite retriever by utilizing VectorStoreIndex for vector-based retrieval, which could be adapted to include structured search functionalities akin to those provided by a PandasQueryEngine.
While these examples provide a foundation for combining different search methods, they may not fully address your requirement to run both the pandas query engine and the vector index in parallel, ensuring both methods are always utilized. Implementing this specific functionality might require additional logic to orchestrate the parallel execution and integration of results from both search methods.
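The orchestration logic described above can be sketched without depending on any particular LlamaIndex class. The following is a minimal illustration, not an existing LlamaIndex implementation: `Hit`, `AlwaysBothRetriever`, and the two stub search functions are hypothetical names, and the structured and semantic searches are assumed to be plain callables that wrap the pandas query engine and the vector index respectively. Both callables are submitted for every query, so neither path can be skipped, and the merged rows are passed to an optional re-ranker.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """One retrieved table row (hypothetical container, not a LlamaIndex type)."""
    row_id: str
    text: str
    score: float
    source: str  # "structured" or "semantic"


SearchFn = Callable[[str], Sequence[Hit]]
RerankFn = Callable[[str, List[Hit]], List[Hit]]


class AlwaysBothRetriever:
    """Runs the structured and the semantic search for every query."""

    def __init__(self, structured: SearchFn, semantic: SearchFn,
                 rerank: Optional[RerankFn] = None) -> None:
        self._searches = (structured, semantic)
        self._rerank = rerank

    def retrieve(self, query: str) -> List[Hit]:
        # Submit both searches at once; neither one is optional.
        with ThreadPoolExecutor(max_workers=2) as pool:
            futures = [pool.submit(fn, query) for fn in self._searches]
            # result() re-raises if either search failed.
            batches = [future.result() for future in futures]

        # Keep one hit per row so a row found by both paths appears once.
        # Scores from the two paths are not comparable in general; keeping
        # the higher one is only a placeholder for a real fusion step.
        best: Dict[str, Hit] = {}
        for batch in batches:
            for hit in batch:
                kept = best.get(hit.row_id)
                if kept is None or hit.score > kept.score:
                    best[hit.row_id] = hit
        merged = list(best.values())

        if self._rerank is not None:
            return self._rerank(query, merged)
        return sorted(merged, key=lambda h: h.score, reverse=True)


# Stub searches standing in for the pandas query engine and the vector index.
def structured_stub(query: str) -> List[Hit]:
    return [Hit("r1", "row 1", 1.0, "structured")]


def semantic_stub(query: str) -> List[Hit]:
    return [Hit("r1", "row 1", 0.7, "semantic"),
            Hit("r2", "row 2", 0.6, "semantic")]


retriever = AlwaysBothRetriever(structured_stub, semantic_stub)
hits = retriever.retrieve("example query")
```

In LlamaIndex itself, the same logic would presumably live in a custom retriever that subclasses `BaseRetriever` and overrides `_retrieve`, so that the parent recursive retriever sees a single retriever for the table. That wiring is an assumption here and is not shown.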
Adapting the mentioned examples to fit your specific needs could be a viable starting point. If you require further assistance or have more detailed questions about implementing this composite retriever, feel free to provide additional context or specifics about your implementation goals.
Question
Hi!
What's the right way to enforce retrieval against both a pandas query engine, which does structured search over a table, and a parallel vector index, which implements semantic search over the same table (using rows as the text representation), so that the parent recursive retriever, if it decides to explore the table info at all, always runs both the pandas query engine and the vector index?
Namely:
and I'd like to see whether it is possible to reuse an existing implementation (if any) to accomplish this:
The motivation is that I already have a good re-ranker implementation to fuse together semantic search + table retrievals and I just want to always use both.
Thanks!