run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Rerank CBE #16433

Open calimbrasti94 opened 1 month ago

calimbrasti94 commented 1 month ago

Question

Hi! I am using the Cohere reranker as a post-processor, and switching to the VoyageAI reranker to double-check. My question is: is it possible that the callback manager can't track the rerank callback event (CBE) with this chat_engine setup:

    chat_engine = index.as_chat_engine(
        query_engine=query_engine, 
        chat_mode="context",  # also tried with "best" with a different log result but no CBE type rerank 
        memory=memory,
        verbose=True
    )

I get this as a callback log:

    Trace: chat
    |_CBEventType.RETRIEVE -> 1.86579 seconds
    |_CBEventType.EMBEDDING -> 1.741489 seconds
    |_CBEventType.SYNTHESIZE -> 25.752929 seconds
    |_CBEventType.TEMPLATING -> 0.0 seconds
    |_CBEventType.LLM -> 25.700519 seconds

By contrast, when I use a query_engine with "CondenseQuestion", the rerank CBE is tracked and shown. Is it possible that the event is tracked with a query_engine but not with a chat_engine? I ask because I want to make sure the reranker is actually being used. Checking the usage for my Cohere API key, it does seem that calls are being made. With VoyageAI, on the other hand, no calls seem to happen, though it could be that they update the usage data more slowly.
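A trace like the one above can be checked mechanically instead of by eye. The following is a minimal sketch that parses the printed trace text and reports which event types occurred. The helper names `parse_trace` and `has_event` are hypothetical, and it assumes rerank events would be printed under the name `RERANKING` (as in `CBEventType.RERANKING`).

```python
import re
from typing import List, Tuple

# Matches entries such as "CBEventType.RETRIEVE -> 1.86579 seconds".
EVENT_RE = re.compile(r"CBEventType\.(\w+)\s*->\s*([0-9.]+)\s*seconds")


def parse_trace(trace: str) -> List[Tuple[str, float]]:
    """Return (event_name, seconds) pairs in the order they appear."""
    return [(name, float(seconds)) for name, seconds in EVENT_RE.findall(trace)]


def has_event(trace: str, event_name: str) -> bool:
    """True if the trace contains at least one event with this name."""
    return any(name == event_name for name, _ in parse_trace(trace))


# The trace text from the question, copied verbatim.
trace = (
    "Trace: chat |_CBEventType.RETRIEVE -> 1.86579 seconds "
    "|_CBEventType.EMBEDDING -> 1.741489 seconds "
    "|_CBEventType.SYNTHESIZE -> 25.752929 seconds "
    "|_CBEventType.TEMPLATING -> 0.0 seconds "
    "|_CBEventType.LLM -> 25.700519 seconds"
)

events = parse_trace(trace)
rerank_seen = has_event(trace, "RERANKING")
print([name for name, _ in events])
print("rerank event present:", rerank_seen)
```

For this trace the check reports no rerank event, which matches what the log shows: only RETRIEVE, EMBEDDING, SYNTHESIZE, TEMPLATING, and LLM were recorded.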

Thanks! Here is the relevant part of the code, including the reranker setup:

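    # Imports needed by this snippet (paths assume llama-index >= 0.10)
    import os

    from llama_index.core.memory import ChatMemoryBuffer
    from llama_index.core.query_engine import RetrieverQueryEngine
    from llama_index.core.retrievers import QueryFusionRetriever
    from llama_index.postprocessor.cohere_rerank import CohereRerank
    from llama_index.retrievers.bm25 import BM25Retriever

    # Initialize Cohere Rerank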
    cohere_api_key = os.environ.get("COHERE_API_KEY")
    if not cohere_api_key:
        raise ValueError("COHERE_API_KEY environment variable is not set")

    reranker = CohereRerank(
        model="rerank-multilingual-v3.0",
        api_key=cohere_api_key,
        top_n=5
    )

    # # Initialize Voyage Rerank
    # # (.get() so that a missing key reaches the check below instead of raising KeyError)
    # voyage_api_key = os.environ.get("VOYAGE_API_KEY")
    # if not voyage_api_key:
    #     raise ValueError("VOYAGE_API_KEY environment variable is not set")

    # reranker = VoyageAIRerank(
    #     api_key=voyage_api_key, 
    #     top_k=5, 
    #     model="rerank-2", 
    #     truncation=True
    # )

    # Define the retrievers
    vector_retriever = index.as_retriever(similarity_top_k=20)
    bm25_retriever = BM25Retriever.from_defaults(docstore=index.docstore, similarity_top_k=20)

    # Define the QueryFusionRetriever
    retriever = QueryFusionRetriever(
        [vector_retriever, bm25_retriever],
        similarity_top_k=20,
        num_queries=4,
        mode="reciprocal_rerank",
        use_async=True,
        verbose=True,
    )

    # Define the query engine with the reranker as a postprocessor
    query_engine = RetrieverQueryEngine.from_args(
        callback_manager=callback_manager,
        retriever=retriever,
        response_mode="compact",
        verbose=True,
        node_postprocessors=[reranker]
    )

    memory = ChatMemoryBuffer.from_defaults(chat_history=messages or [])

    chat_engine = index.as_chat_engine(
        query_engine=query_engine, 
        chat_mode="best",
        memory=memory,
        verbose=True
    )

dosubot[bot] commented 1 month ago

Hey there, @calimbrasti94! I'm here to help you with any bugs, questions, or contributions. Let's tackle this together!

The issue you're experiencing with the callback manager not tracking the "CBE rerank" event in the chat engine setup could be due to differences in how the chat engine and query engine handle callback events.

In the chat engine, there is no explicit query event, which might affect how certain events, like reranking, are logged. The query text in the chat engine is set to the last message passed to the LLM, which differs from the query engine where the query event is directly captured [1].

If the reranker is being used, you should see API calls being made, as you noticed with the Cohere API. However, the absence of specific callback events in the logs might be due to the chat engine's handling of events. You might want to ensure that the reranker is correctly integrated and that the callback manager is set up to capture all relevant events.

For further troubleshooting, you could:

  1. Double-check the integration of the reranker in both engines.
  2. Ensure that the callback manager is configured to log all necessary events.
  3. Compare the callback logs between the chat and query engines to identify any discrepancies.

If the VoyageAI API calls are not appearing, it might be worth checking the API key setup and any potential delays in their logging system.
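One way to act on the "ensure that the reranker is correctly integrated" suggestion is to hand the reranker to the chat engine directly rather than relying on the `query_engine` argument of `index.as_chat_engine`. Whether that argument is honored in "context" mode is not confirmed in this thread, so treat this as an assumption to test. The sketch below uses `ContextChatEngine.from_defaults` with `node_postprocessors`; the helper name `build_context_chat_engine` is hypothetical, and `retriever`, `reranker`, and `memory` are the objects from the snippet in the question.

```python
def build_context_chat_engine(retriever, reranker, memory):
    """Build a context chat engine that applies `reranker` to retrieved nodes.

    Sketch only: assumes llama-index >= 0.10 and that the LLM is taken from
    the global Settings when none is passed explicitly.
    """
    # Imported lazily so this helper can be defined even where llama_index
    # is not installed; the import happens only when the helper is called.
    from llama_index.core.chat_engine import ContextChatEngine

    return ContextChatEngine.from_defaults(
        retriever=retriever,             # e.g. the QueryFusionRetriever
        memory=memory,                   # e.g. the ChatMemoryBuffer
        node_postprocessors=[reranker],  # reranker runs on retrieved nodes
    )
```

It would be called as `chat_engine = build_context_chat_engine(retriever, reranker, memory)`. If a rerank event then shows up in the trace, the earlier setup was most likely not passing the reranker through to the chat engine.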

To continue talking to Dosu, mention @dosu.

calimbrasti94 commented 1 month ago

@dosu then is the code snippet that encapsulates the logic of retrievers, queries, and chats correct or is it used incorrectly?

    # Initialize Cohere Rerank
    cohere_api_key = os.environ.get("COHERE_API_KEY")
    if not cohere_api_key:
        raise ValueError("COHERE_API_KEY environment variable is not set")

    reranker = CohereRerank(
        model="rerank-multilingual-v3.0",
        api_key=cohere_api_key,
        top_n=5
    )

    # # Initialize Voyage Rerank
    # # (.get() so that a missing key reaches the check below instead of raising KeyError)
    # voyage_api_key = os.environ.get("VOYAGE_API_KEY")
    # if not voyage_api_key:
    #     raise ValueError("VOYAGE_API_KEY environment variable is not set")

    # reranker = VoyageAIRerank(
    #     api_key=voyage_api_key,
    #     top_k=5,
    #     model="rerank-2",
    #     truncation=True
    # )

    # Define the retrievers
    vector_retriever = index.as_retriever(similarity_top_k=20)
    bm25_retriever = BM25Retriever.from_defaults(docstore=index.docstore, similarity_top_k=20)

    # Define the QueryFusionRetriever
    retriever = QueryFusionRetriever(
        [vector_retriever, bm25_retriever],
        similarity_top_k=20,
        num_queries=4,
        mode="reciprocal_rerank",
        use_async=True,
        verbose=True,
    )

    # Define the query engine with the reranker as a postprocessor
    query_engine = RetrieverQueryEngine.from_args(
        callback_manager=callback_manager,
        retriever=retriever,
        response_mode="compact",
        verbose=True,
        node_postprocessors=[reranker]
    )

    memory = ChatMemoryBuffer.from_defaults(chat_history=messages or [])

    chat_engine = index.as_chat_engine(
        query_engine=query_engine,
        chat_mode="best",
        memory=memory,
        verbose=True
    )

dosubot[bot] commented 1 month ago

The code snippet for the retrievers, the query engine, and the chat engine appears to be implemented correctly. Here is a summary of the main parts:

  1. Cohere Rerank: Correctly initialized with the model and the API key. Make sure the COHERE_API_KEY environment variable is set.

  2. Retrievers:

    • vector_retriever and bm25_retriever are defined correctly.
    • QueryFusionRetriever is configured with explicit parameters such as similarity_top_k, num_queries, and mode.

  3. Query Engine:

    • Created with RetrieverQueryEngine and includes the reranker as a postprocessor.

  4. Chat Engine:

    • Configured with index.as_chat_engine using the query_engine and the "best" chat mode.

Everything looks in order, but make sure the API keys are correct and that the callback_manager is configured to track all the necessary events.
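As a rough illustration of what configuring the callback manager to track events could look like, the sketch below builds a `CallbackManager` with a `LlamaDebugHandler` and counts recorded rerank events after a chat call. The helper names `make_debug_callback_manager` and `count_rerank_events` are hypothetical, and the import paths assume llama-index >= 0.10.

```python
def make_debug_callback_manager():
    """Return (callback_manager, debug_handler) that print a trace per call."""
    # Imported lazily so the helpers can be defined without llama_index installed.
    from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

    debug_handler = LlamaDebugHandler(print_trace_on_end=True)
    return CallbackManager([debug_handler]), debug_handler


def count_rerank_events(debug_handler):
    """Number of completed rerank events the handler has recorded so far."""
    from llama_index.core.callbacks import CBEventType

    # get_event_pairs returns (start, end) event pairs for the given type.
    return len(debug_handler.get_event_pairs(CBEventType.RERANKING))
```

A possible usage: create the pair once, assign the manager globally with `Settings.callback_manager = callback_manager` (from `llama_index.core`) before building the retrievers and engines, run `chat_engine.chat(...)`, then call `count_rerank_events(debug_handler)`. A result of zero after a chat turn would suggest the reranker was not invoked through that engine, or that its events are not reaching this manager.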