run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Bug]: mmr_threshold not supported by ChromaVectorStore #10682

Open jfkoh opened 6 months ago

jfkoh commented 6 months ago

Bug Description

I previously implemented a VectorIndexRetriever using LlamaIndex's built-in vector index without any vector database, with MMR mode and mmr_threshold. It worked fine.

I then added ChromaDB and found that MMR mode works as long as you don't include the mmr_threshold. If you set the mmr_threshold in the vector_store_kwargs argument, you get an error.

Version

llama-index-0.10.1

Steps to Reproduce

Here are the relevant parts of my code. If I comment out the line indicated by "# ERROR", then the code works.

from llama_index.core.indices.vector_store.base import VectorStoreIndex
from llama_index.core.storage import StorageContext
from llama_index.core.indices.vector_store.retrievers.retriever import (
    VectorIndexRetriever,
)
from llama_index.core.query_engine.retriever_query_engine import (
    RetrieverQueryEngine
)
from llama_index.core import Settings
from llama_index.core.base.response.schema import Response
from llama_index.vector_stores.chroma.base import ChromaVectorStore
import chromadb
import os  # needed for the os.path.exists() check below

def run_query(
    question: str,
    vectorstore: str,
    top_k: int,
    mmr_threshold: float,
) -> Response | None:
    '''
    Return an LLM response to an input query after doing a vector search.

    Args:
        question (str): The query to the LLM.
        vectorstore (str): Folder name of vector database.
        top_k (int): Number of retrievals or citations to retrieve via
            vector search.
        mmr_threshold (float): A float between 0 and 1, for MMR search mode.
            Closer to 0 gives you more diversity in the retrievals.
            Closer to 1 gives you more relevance in the retrievals.

    Returns:
        Response | None: If the vectorstore location exists, return the
            Response object produced by RetrieverQueryEngine, else return
            nothing.
    '''

    if not os.path.exists(vectorstore):
        print('Error: Vectorstore', vectorstore, 'not found!')
        return
    else:
        # Instantiate a Chroma client, setting storage folder location:
        client = chromadb.PersistentClient(path=vectorstore)

        # Instantiate a Chroma collection based on the client:
        chroma_collection = client.get_or_create_collection(vectorstore)

        # Instantiate a ChromaVectorStore based on the Chroma collection:
        vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

        # Instantiate a storage context based on the ChromaVectorStore:
        storage_context = StorageContext.from_defaults(
            vector_store=vector_store
        )

        # Instantiate LLM and embedding model (create_azure_models is the
        # author's own helper and is not shown here):
        llm, embedding = create_azure_models()

        # Add these 2 models to the LlamaIndex Settings:
        Settings.llm = llm
        Settings.embed_model = embedding

        index = VectorStoreIndex.from_vector_store(
            vector_store,
            storage_context=storage_context
        )

        # Instantiate and configure a VectorIndexRetriever:
        # Note about parameters:
        # similarity_top_k sets the number of retrievals (citations).
        # mmr_threshold is a value between 0 and 1.
        # Closer to 0 gives you more diversity.
        # Closer to 1 gives you more relevance.
        # If the data contains duplicated entries, set it lower (e.g. 0.2)
        # so that the retriever will skip over search results that are
        # identical or very similar and go for greater diversity.
        retriever = VectorIndexRetriever(
            index=index,
            similarity_top_k=top_k,
            vector_store_query_mode='mmr',
            vector_store_kwargs={'mmr_threshold': mmr_threshold}  # ERROR
        )

        # Instantiate RetrieverQueryEngine and pass in the VectorIndexRetriever:
        query_engine = RetrieverQueryEngine(
            retriever=retriever
        )

        # Query the index:
        response = query_engine.query(question)

        return response

Relevant Logs/Tracebacks

File ".../text_data_helpers.py", line 142, in run_query
    response = query_engine.query(question)
  File ".../lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py", line 40, in query
    return self._query(str_or_query_bundle)
  File ".../lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 186, in _query
    nodes = self.retrieve(query_bundle)
  File ".../lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 142, in retrieve
    nodes = self._retriever.retrieve(query_bundle)
  File ".../lib/python3.10/site-packages/llama_index/core/base/base_retriever.py", line 229, in retrieve
    nodes = self._retrieve(query_bundle)
  File ".../lib/python3.10/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 94, in _retrieve
    return self._get_nodes_with_embeddings(query_bundle)
  File ".../lib/python3.10/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 170, in _get_nodes_with_embeddings
    query_result = self._vector_store.query(query, **self._kwargs)
  File ".../lib/python3.10/site-packages/llama_index/vector_stores/chroma/base.py", line 292, in query
    results = self._collection.query(
TypeError: Collection.query() got an unexpected keyword argument 'mmr_threshold'
dosubot[bot] commented 6 months ago

🤖

Hey there, @jfkoh! Good to see you back here. Hope all is well on your end!

The error message you're encountering, "Collection.query() got an unexpected keyword argument 'mmr_threshold'", is due to the 'mmr_threshold' argument not being a recognized parameter for the 'query' method of the 'ChromaVectorStore' class in LlamaIndex version 0.10.1.

The 'query' method of the 'ChromaVectorStore' class accepts two arguments: 'query', an instance of 'VectorStoreQuery', and '**kwargs', a dictionary of additional keyword arguments that are forwarded to Chroma. 'mmr_threshold' is not an argument that Chroma recognizes, hence the TypeError.

To resolve this issue, you should remove the 'mmr_threshold' argument from the 'query' method call. If 'mmr_threshold' is a necessary parameter for your use case, you may need to implement a custom solution that incorporates this parameter, or check if a newer version of LlamaIndex supports this argument.
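
For example, a minimal sketch of the retriever from the reproduction code above with only the unsupported kwarg removed (per the bug description, MMR mode itself is accepted; only 'mmr_threshold' reaches Chroma and fails):

retriever = VectorIndexRetriever(
    index=index,                   # same index as in the reproduction code
    similarity_top_k=top_k,
    vector_store_query_mode='mmr',
    # vector_store_kwargs={'mmr_threshold': mmr_threshold}  # unsupported by Chroma, removed
)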

Here is the relevant snippet from the ChromaVectorStore.query method:

def query(self, query: VectorStoreQuery, **kwargs: Any) -> VectorStoreQueryResult:
    """Query index for top k most similar nodes.

    Args:
        query_embedding (List[float]): query embedding
        similarity_top_k (int): top k most similar nodes
    """
    if query.filters is not None:
        if "where" in kwargs:
            raise ValueError(
                "Cannot specify metadata filters via both query and kwargs. "
                "Use kwargs only for chroma specific items that are "
                "not supported via the generic query interface."
            )
        where = _to_chroma_filter(query.filters)
    else:
        where = kwargs.pop("where", {})

    results = self._collection.query(
        query_embeddings=query.query_embedding,
        n_results=query.similarity_top_k,
        where=where,
        **kwargs,
    )
    ...

You can find this in the llama_index/vector_stores/chroma.py file.
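
In other words, everything placed in vector_store_kwargs is forwarded verbatim into Collection.query() (the traceback above shows retriever.py calling self._vector_store.query(query, **self._kwargs)), so only Chroma-native keywords survive. As a hedged sketch: a 'where' metadata filter is explicitly popped and handled in the snippet above and therefore works, while 'mmr_threshold' does not; the 'topic' filter value below is purely illustrative.

retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=top_k,
    # 'where' is popped by ChromaVectorStore.query(), so it is safe to pass;
    # 'mmr_threshold' would be forwarded to Chroma's Collection.query() and
    # raise the TypeError shown in the traceback.
    vector_store_kwargs={'where': {'topic': 'finance'}},  # hypothetical metadata filter
)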

Regarding the 'mmr_threshold' argument, it is used to control the trade-off between relevance and diversity when using the Maximum Marginal Relevance (MMR) retrieval method. The MMR retrieval method is used to iteratively find documents that are dissimilar to previous results. The 'mmr_threshold' parameter, represented by λ in the MMR algorithm, determines the balance between relevance (similarity between a document and the query) and diversity (dissimilarity between a document and already selected documents).

If 'mmr_threshold' is close to 1, more emphasis is put on relevance, while a 'mmr_threshold' close to 0 puts more emphasis on diversity. As the threshold goes to 1, the scores will become equal and similarity to previous nodes will be ignored, turning off the impact of MMR. By lowering the threshold, the algorithm will prefer more diverse documents.
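
For illustration only (this is not the library's internal code), here is a minimal sketch of the greedy MMR selection described above, where 'threshold' plays the role of λ (1.0 = pure relevance, 0.0 = pure diversity):

from typing import List

import numpy as np

def mmr_select(
    query_emb: np.ndarray,   # shape (d,)
    doc_embs: np.ndarray,    # shape (n, d)
    top_k: int,
    threshold: float,
) -> List[int]:
    """Return the indices of the top_k documents chosen greedily by MMR."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    relevance = [cos(query_emb, d) for d in doc_embs]
    selected: List[int] = []
    candidates = list(range(len(doc_embs)))
    while candidates and len(selected) < top_k:
        def mmr_score(i: int) -> float:
            # Penalise similarity to documents that have already been selected.
            redundancy = max(
                (cos(doc_embs[i], doc_embs[j]) for j in selected), default=0.0
            )
            return threshold * relevance[i] - (1 - threshold) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected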

Here is how you can set the 'mmr_threshold' in the LlamaIndex framework:

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine_with_threshold = index.as_query_engine(
    vector_store_query_mode="mmr", vector_store_kwargs={"mmr_threshold": 0.2}
)

response = query_engine_with_threshold.query(
    "What did the author do growing up?"
)
print(response)

In this example, the 'mmr_threshold' is set to 0.2, which means the MMR retrieval method will put more emphasis on diversity. You can find this in the docs/examples/vector_stores/SimpleIndexDemoMMR.ipynb file.

I hope this helps! If you have any other questions, feel free to ask.

logan-markewich commented 6 months ago

This isn't a llama-index error? Chroma does not have an mmr_threshold kwarg it seems.

In fact, MMR is not mentioned anywhere in their docs; I don't think they support it.

jfkoh commented 6 months ago

Thanks Logan for your comment :)

I've seen Langchain implement something like mmr_threshold for Chroma but they call it lambda_mult or score_threshold. See this page: https://api.python.langchain.com/en/v0.0.342/vectorstores/langchain.vectorstores.chroma.Chroma.html

I tried those 2 terms in LlamaIndex but they don't work.
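
For comparison, this is roughly what the MMR call looks like on the LangChain side, based on the API page linked above (a hedged sketch; the collection name, persist directory and embedding model are placeholders). Note that LangChain runs MMR client-side over a larger fetch_k candidate set rather than inside Chroma itself:

from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings

db = Chroma(
    collection_name="my_collection",        # placeholder
    embedding_function=OpenAIEmbeddings(),  # placeholder embedding model
    persist_directory="./chroma_db",        # placeholder
)

# lambda_mult plays the same role as mmr_threshold:
# 1.0 = maximum relevance, 0.0 = maximum diversity.
docs = db.max_marginal_relevance_search(
    "What did the author do growing up?", k=5, fetch_k=20, lambda_mult=0.2
)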

logan-markewich commented 6 months ago

Well, I guess we don't have it implemented at the moment :) PRs are very welcome

cmosguy commented 4 months ago

@logan-markewich so no one fixed this yet? Seems like a really critical thing to have for proper RAG retrieval.

logan-markewich commented 4 months ago

@cmosguy mmr threshold isn't too widely used. It hasn't been requested since this issue was opened. And no one has contributed it. PRs are welcome
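
Until someone does, one hypothetical interim workaround (not a LlamaIndex feature; the retrieve_with_mmr helper is invented for this sketch) is to over-fetch plain similarity results from Chroma and re-rank them client-side, for example with the mmr_select() sketch earlier in this thread:

import numpy as np

from llama_index.core import Settings

def retrieve_with_mmr(index, question, top_k=5, fetch_k=20, threshold=0.2):
    # Over-fetch candidates with a plain similarity search against Chroma.
    retriever = index.as_retriever(similarity_top_k=fetch_k)
    candidates = retriever.retrieve(question)

    # Re-embed the query and candidate texts (simple but wasteful; a real
    # implementation would reuse the embeddings already stored in Chroma).
    embed_model = Settings.embed_model
    query_emb = np.array(embed_model.get_query_embedding(question))
    doc_embs = np.array(
        [embed_model.get_text_embedding(n.node.get_content()) for n in candidates]
    )

    # mmr_select() is the illustrative sketch from earlier in this thread.
    keep = mmr_select(query_emb, doc_embs, top_k, threshold)
    return [candidates[i] for i in keep]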