run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

How can we pass hybrid_top_k as a parameter to VectorIndexRetriever #15304

Open Sanjit0910 opened 1 month ago

Sanjit0910 commented 1 month ago

Question

I'm developing a hybrid search-based chat engine. To set up the retriever, I am using the following configuration:

retriever = index.as_retriever(similarity_top_k=6, sparse_top_k=12, vector_store_query_mode="hybrid")

I understand that, by default, the retriever returns the number of documents specified by similarity_top_k. However, I need different top_k values for the dense retrieval step and for the final hybrid results. While exploring the documentation, I found that VectorStoreQuery has a hybrid_top_k parameter that controls the number of documents returned after the dense and sparse results are fused. Unfortunately, VectorIndexRetriever does not appear to support this parameter directly.

Is there an alternative method to configure different top_k values for dense and hybrid retrievals within VectorIndexRetriever? If not, is there a recommended approach to achieve this behavior?

dosubot[bot] commented 1 month ago

Hello @Sanjit0910! I'm here to assist you with any bugs, questions, or contributions.

There is no alternative method to configure different top_k values for dense and hybrid retrievals within VectorIndexRetriever. The recommended approach to achieve this behavior is to use the sparse_top_k and similarity_top_k parameters.

For example, in the hybrid search mode, you can set sparse_top_k and similarity_top_k separately:

retriever = index.as_retriever(
    similarity_top_k=6,
    sparse_top_k=12,
    vector_store_query_mode="hybrid"
)

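This configuration uses similarity_top_k for dense retrieval and sparse_top_k for sparse retrieval. Neither parameter sets the number of results returned after fusion, which is what hybrid_top_k controls.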
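To make the role of the three values concrete, here is a minimal, self-contained sketch of a hybrid fusion step. It is not LlamaIndex code: the `fuse_hybrid` function, the reciprocal rank fusion scoring, the `rrf_k` constant, and the fallback to similarity_top_k when hybrid_top_k is unset are all assumptions made for illustration. How fusion actually happens depends on the vector store backend.

```python
from typing import Dict, List, Optional, Tuple

# (node_id, score) pairs; a higher score means a better match.
Scored = List[Tuple[str, float]]


def top_k(results: Scored, k: int) -> Scored:
    """Keep the k highest-scoring results."""
    return sorted(results, key=lambda item: item[1], reverse=True)[:k]


def fuse_hybrid(
    dense: Scored,
    sparse: Scored,
    similarity_top_k: int,
    sparse_top_k: int,
    hybrid_top_k: Optional[int] = None,
    rrf_k: int = 60,  # hypothetical smoothing constant for reciprocal rank fusion
) -> List[str]:
    """Fuse dense and sparse candidates, then trim to the post-fusion limit."""
    # Each retrieval path is cut to its own candidate count first.
    dense_hits = top_k(dense, similarity_top_k)
    sparse_hits = top_k(sparse, sparse_top_k)

    # Reciprocal rank fusion: a node earns 1 / (rrf_k + rank) per list it is in.
    fused: Dict[str, float] = {}
    for hits in (dense_hits, sparse_hits):
        for rank, (node_id, _score) in enumerate(hits, start=1):
            fused[node_id] = fused.get(node_id, 0.0) + 1.0 / (rrf_k + rank)

    ranked = sorted(fused, key=lambda node_id: fused[node_id], reverse=True)

    # Assumption: without hybrid_top_k, fall back to similarity_top_k.
    limit = hybrid_top_k if hybrid_top_k is not None else similarity_top_k
    return ranked[:limit]


dense = [("a", 0.9), ("b", 0.8), ("c", 0.7)]
sparse = [("c", 12.0), ("d", 9.0), ("a", 3.0), ("e", 1.0)]
result = fuse_hybrid(dense, sparse, similarity_top_k=2, sparse_top_k=3, hybrid_top_k=3)
```

In this sketch, similarity_top_k and sparse_top_k only bound the candidate pools, while hybrid_top_k bounds what comes out of the fusion step. That is the distinction the question is asking about.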

Additionally, if you are using the MultiModalVectorIndexRetriever from LlamaIndex, you can configure different top_k values for text and image retrievals using the similarity_top_k and image_similarity_top_k parameters respectively:

class MultiModalVectorIndexRetriever(MultiModalRetriever):
    def __init__(
        self,
        index: "MultiModalVectorStoreIndex",
        similarity_top_k: int = DEFAULT_SIMILARITY_TOP_K,
        image_similarity_top_k: int = DEFAULT_SIMILARITY_TOP_K,
        ...
    ) -> None:
        ...
        self._similarity_top_k = similarity_top_k
        self._image_similarity_top_k = image_similarity_top_k
        ...

This allows you to set different top_k values for text and image retrievals within the same retriever [1][2].

To continue talking to Dosu, mention @dosu.

Sanjit0910 commented 1 month ago

I attempted a workaround by modifying the retriever.py file in the llama-index-core package. Specifically, I adjusted the code to pass the hybrid_top_k parameter to the __init__ and _build_vector_store_query functions. However, I'm unsure whether this approach is recommended or whether there is a better, more standard way to achieve this.

dosubot[bot] commented 1 month ago

Modifying the retriever.py file in the llama-index-core package to pass the hybrid_top_k parameter to the __init__ and _build_vector_store_query functions is a reasonable approach for configuring different top_k values for dense and hybrid retrievals within VectorIndexRetriever. The current implementation supports passing various parameters through the __init__ method and _build_vector_store_query function, so adding hybrid_top_k would be consistent with this design.

Here is how you can modify the __init__ method and _build_vector_store_query function to include the hybrid_top_k parameter:

class VectorIndexRetriever(BaseRetriever):
    """Vector index retriever.

    Args:
        index (VectorStoreIndex): vector store index.
        similarity_top_k (int): number of top k results to return.
        vector_store_query_mode (str): vector store query mode
            See reference for VectorStoreQueryMode for full list of supported modes.
        filters (Optional[MetadataFilters]): metadata filters, defaults to None
        alpha (float): weight for sparse/dense retrieval, only used for
            hybrid query mode.
        doc_ids (Optional[List[str]]): list of documents to constrain search.
        vector_store_kwargs (dict): Additional vector store specific kwargs to pass
            through to the vector store at query time.
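        sparse_top_k (Optional[int]): number of sparse results to retrieve,
            only used for hybrid query mode.
        hybrid_top_k (Optional[int]): number of results to return after the dense
            and sparse results are fused, only used for hybrid query mode.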
    """

    def __init__(
        self,
        index: VectorStoreIndex,
        similarity_top_k: int = DEFAULT_SIMILARITY_TOP_K,
        vector_store_query_mode: VectorStoreQueryMode = VectorStoreQueryMode.DEFAULT,
        filters: Optional[MetadataFilters] = None,
        alpha: Optional[float] = None,
        node_ids: Optional[List[str]] = None,
        doc_ids: Optional[List[str]] = None,
        sparse_top_k: Optional[int] = None,
        hybrid_top_k: Optional[int] = None,
        callback_manager: Optional[CallbackManager] = None,
        object_map: Optional[dict] = None,
        embed_model: Optional[BaseEmbedding] = None,
        verbose: bool = False,
        **kwargs: Any,
    ) -> None:
        """Initialize params."""
        self._index = index
        self._vector_store = self._index.vector_store
        self._embed_model = embed_model or self._index._embed_model
        self._docstore = self._index.docstore

        self._similarity_top_k = similarity_top_k
        self._vector_store_query_mode = VectorStoreQueryMode(vector_store_query_mode)
        self._alpha = alpha
        self._node_ids = node_ids
        self._doc_ids = doc_ids
        self._filters = filters
        self._sparse_top_k = sparse_top_k
        self._hybrid_top_k = hybrid_top_k
        self._kwargs: Dict[str, Any] = kwargs.get("vector_store_kwargs", {})
        # self._service_context is never assigned in this __init__, so fall back
        # to a fresh CallbackManager instead of reading one from it.
        self.callback_manager = callback_manager or CallbackManager()
        super().__init__(
            callback_manager=self.callback_manager,
            object_map=object_map,
            verbose=verbose,
        )

    @dispatcher.span
    def _build_vector_store_query(
        self, query_bundle_with_embeddings: QueryBundle
    ) -> VectorStoreQuery:
        return VectorStoreQuery(
            query_embedding=query_bundle_with_embeddings.embedding,
            similarity_top_k=self._similarity_top_k,
            node_ids=self._node_ids,
            doc_ids=self._doc_ids,
            query_str=query_bundle_with_embeddings.query_str,
            mode=self._vector_store_query_mode,
            alpha=self._alpha,
            filters=self._filters,
            sparse_top_k=self._sparse_top_k,
            hybrid_top_k=self._hybrid_top_k,
        )

This modification allows you to configure different top_k values for dense and hybrid retrievals by passing the hybrid_top_k parameter when initializing the VectorIndexRetriever [1][2].
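Because edits to an installed package are lost on the next upgrade, the same change can also be made in a subclass. The sketch below shows only the pattern and does not import LlamaIndex: `VectorStoreQuery` and `VectorIndexRetriever` are simplified stand-ins defined here (the real `_build_vector_store_query` takes a QueryBundle, not a string), and `HybridTopKRetriever` is a hypothetical name. It also assumes the real VectorStoreQuery is a dataclass that can be copied with `dataclasses.replace`.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass
class VectorStoreQuery:
    """Stand-in holding only the fields this sketch needs (not the library class)."""

    query_str: str
    similarity_top_k: int = 2
    sparse_top_k: Optional[int] = None
    hybrid_top_k: Optional[int] = None


class VectorIndexRetriever:
    """Stand-in for the library retriever, reduced to the query-building step."""

    def __init__(
        self, similarity_top_k: int = 2, sparse_top_k: Optional[int] = None
    ) -> None:
        self._similarity_top_k = similarity_top_k
        self._sparse_top_k = sparse_top_k

    def _build_vector_store_query(self, query_str: str) -> VectorStoreQuery:
        return VectorStoreQuery(
            query_str=query_str,
            similarity_top_k=self._similarity_top_k,
            sparse_top_k=self._sparse_top_k,
        )


class HybridTopKRetriever(VectorIndexRetriever):
    """Hypothetical subclass that adds hybrid_top_k without editing the base class."""

    def __init__(
        self, *args: Any, hybrid_top_k: Optional[int] = None, **kwargs: Any
    ) -> None:
        super().__init__(*args, **kwargs)
        self._hybrid_top_k = hybrid_top_k

    def _build_vector_store_query(self, query_str: str) -> VectorStoreQuery:
        # Reuse the parent's query and set only the extra field.
        query = super()._build_vector_store_query(query_str)
        return replace(query, hybrid_top_k=self._hybrid_top_k)


retriever = HybridTopKRetriever(similarity_top_k=6, sparse_top_k=12, hybrid_top_k=8)
query = retriever._build_vector_store_query("what is hybrid search?")
```

Whether the value has any effect still depends on the vector store: the backend has to read hybrid_top_k from the query for the post-fusion limit to apply.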