Open · aclifton314 opened 1 month ago
The issue you're encountering, where the Mean Average Precision (MAP) evaluates to zero, is likely due to a mismatch between the `retrieved_ids` and `expected_ids`. Here are a few potential reasons and suggestions to address this:

1. **Document ID Consistency**: Ensure that the document IDs (`doc_id`) used in your `expected_ids` are consistent with those stored in the Weaviate vector store. If there's any discrepancy in how IDs are assigned or retrieved, it could lead to mismatches.
2. **Query and Retrieval Logic**: Verify that the queries generated by `RagDatasetGenerator` are correctly aligned with the documents in your vector store. The queries should be relevant to the content of the documents so that the retriever can find the correct matches.
3. **Retriever Configuration**: Check the configuration of your retriever. Ensure that the `similarity_top_k` parameter is set high enough to retrieve a sufficient number of relevant documents. If it's too low, relevant documents might not be retrieved.
4. **Chunk ID Retrieval**: In your `_get_chunk_id` method, ensure that the logic for matching text to chunk IDs is correct. The method should accurately find the corresponding `doc_id` for the given text. The return of `'999999'` might indicate a fallback or error case, which could lead to incorrect `expected_ids`.
5. **Debugging**: Add logging or print statements to inspect the `retrieved_ids` and `expected_ids` during evaluation (see the sketch after this list). This can help identify where the mismatch occurs.

By addressing these areas, you should be able to identify why the `retrieved_ids` are not matching the `expected_ids` and adjust your implementation accordingly.
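A minimal sketch of that debugging step, assuming a `retriever` built from your index and an `eval_pairs` list of `(query, expected_ids)` tuples from your dataset (both names are placeholders for your own objects):

```python
from llama_index.core.evaluation import RetrieverEvaluator

# "mrr" and "hit_rate" are built-in metric names; any overlap between the
# two id lists pushes them above zero, which makes mismatches easy to spot.
evaluator = RetrieverEvaluator.from_metric_names(
    ["mrr", "hit_rate"], retriever=retriever
)

for query, expected_ids in eval_pairs:  # your (question, expected ids) tuples
    result = evaluator.evaluate(query=query, expected_ids=expected_ids)
    print("query:    ", query)
    print("retrieved:", result.retrieved_ids)
    print("expected: ", result.expected_ids)
    print("overlap:  ", set(result.retrieved_ids) & set(expected_ids))
```

If the overlap is empty for every query, the two id lists come from different id spaces, which is exactly the symptom described below.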
@aclifton314 the retrieved ids and expected ids are based on the node ids from your index.

When you load documents, they get split into chunks, and `.generate_questions_from_nodes` generates a question for each chunk, with the assumption that retrieving with that question should return the associated node id in the top k.

However, it seems like `flare.chunks` is pointing to the input documents rather than the actual nodes.

A setup like this will probably work:
```python
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.core.node_parser import SentenceSplitter

splitter = SentenceSplitter(chunk_size=1024, chunk_overlap=128)
...
sdr = SimpleDirectoryReader(self.data_dir)
documents = sdr.load_data()
# Split the documents into nodes; these node ids are what the retriever returns
self.chunks = splitter(documents)
storage_context = StorageContext.from_defaults(vector_store=vectorstore)
# Index the nodes themselves so expected and retrieved ids share one id space
self.index = VectorStoreIndex(
    nodes=self.chunks, storage_context=storage_context, embed_model=Settings.embed_model
)
```
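To close the loop, the question set should then be generated from those same nodes. A short sketch, assuming the `RagDatasetGenerator` mentioned in the question and a configured `Settings.llm` (variable names are illustrative):

```python
from llama_index.core.llama_dataset.generator import RagDatasetGenerator

# Generate questions from the same nodes that were indexed, so the node id
# attached to each question matches an id the retriever can actually return.
generator = RagDatasetGenerator(nodes=self.chunks, llm=Settings.llm)
dataset = generator.generate_questions_from_nodes()
```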
@logan-markewich I think I see what you are saying. A `Document` chunk has its own id, while a `Node` chunk can have a different id. Since `.generate_questions_from_nodes` expects `Node` objects to retrieve ids from, and my `expected_ids` (as `eval_tup[1]`) has ids from `Document` objects, that would account for the difference in the id lists. Have I understood correctly?
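A quick way to see the two id spaces side by side (a minimal sketch, assuming documents are loaded from a local `./data` directory):

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("./data").load_data()
nodes = SentenceSplitter(chunk_size=1024, chunk_overlap=128)(documents)

print(documents[0].doc_id)  # id of the whole input document
print(nodes[0].node_id)     # id of the first chunk: a different value
# ref_doc_id links a chunk back to the document it was split from
print(nodes[0].ref_doc_id == documents[0].doc_id)  # True
```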
Question
llama-index: 0.10.62, Python: 3.11.9
Hi Llama-Index Community!
I think I am messing something up when trying to calculate Mean Average Precision, but I am not entirely sure and could use the help of the community. Here is some sample code:

Imports:

A class to create a `FLAREInstructQueryEngine`:

A class to evaluate the Mean Average Precision:

Main:
What I end up getting is `map = 0`. When I investigated this further, I noticed that none of the `retrieved_ids` that come from `BaseRetrievalEvaluator.aevaluate()` are in the `expected_ids` of that class. I would have expected that the `retrieved_ids` get pulled from the Weaviate database somehow.

Any help is much appreciated. Thanks in advance.