Open danerlt opened 1 month ago
To address the issue where `RetrieverEvaluator` raises an error when no nodes are retrieved, you can modify the `HitRate` class to handle cases where `retrieved_ids` or `expected_ids` is `None` or empty. With this change, the evaluation returns a score of 0.0 instead of raising an error.
Here is the updated `HitRate` class:
```python
from typing import List, Optional

from llama_index.core.evaluation.retrieval.metrics_base import (
    BaseRetrievalMetric,
    RetrievalMetricResult,
)


class HitRate(BaseRetrievalMetric):
    """Hit rate metric: Compute hit rate with two calculation options.

    - The default method checks for a single match between any of the retrieved docs and expected docs.
    - The more granular method checks for all potential matches between retrieved docs and expected docs.

    Attributes:
        use_granular_hit_rate (bool): Determines whether to use the granular method for calculation.
        metric_name (str): The name of the metric.
    """

    metric_name: str = "hit_rate"
    use_granular_hit_rate: bool = False

    def compute(
        self,
        query: Optional[str] = None,
        expected_ids: Optional[List[str]] = None,
        retrieved_ids: Optional[List[str]] = None,
        expected_texts: Optional[List[str]] = None,
        retrieved_texts: Optional[List[str]] = None,
    ) -> RetrievalMetricResult:
        """Compute metric based on the provided inputs.

        Parameters:
            query (Optional[str]): The query string (not used in the current implementation).
            expected_ids (Optional[List[str]]): Expected document IDs.
            retrieved_ids (Optional[List[str]]): Retrieved document IDs.
            expected_texts (Optional[List[str]]): Expected texts (not used in the current implementation).
            retrieved_texts (Optional[List[str]]): Retrieved texts (not used in the current implementation).

        Returns:
            RetrievalMetricResult: The result with the computed hit rate score.
        """
        # Missing (None) and empty ID lists both score 0.0 instead of raising.
        if not retrieved_ids or not expected_ids:
            return RetrievalMetricResult(score=0.0)

        if self.use_granular_hit_rate:
            # Granular HitRate calculation: count all hits and divide by the number of expected docs
            expected_set = set(expected_ids)
            hits = sum(1 for doc_id in retrieved_ids if doc_id in expected_set)
            score = hits / len(expected_ids)
        else:
            # Default HitRate calculation: check whether there is at least one hit
            is_hit = any(doc_id in expected_ids for doc_id in retrieved_ids)
            score = 1.0 if is_hit else 0.0

        return RetrievalMetricResult(score=score)
```
This modification ensures that if either `expected_ids` or `retrieved_ids` is `None` or an empty list, the method returns a score of 0.0 without raising an error [1].
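To sanity-check the guard without installing anything, the same branch logic can be exercised in isolation. The following is a minimal sketch, not the LlamaIndex implementation: `hit_rate_score` is a hypothetical standalone function that mirrors the logic above and returns a plain float rather than a `RetrievalMetricResult`.

```python
from typing import List, Optional


def hit_rate_score(
    expected_ids: Optional[List[str]],
    retrieved_ids: Optional[List[str]],
    use_granular_hit_rate: bool = False,
) -> float:
    """Hypothetical standalone mirror of the HitRate.compute logic above."""
    # None and empty lists are both falsy, so one check covers every "missing" case.
    if not retrieved_ids or not expected_ids:
        return 0.0
    if use_granular_hit_rate:
        # Count every retrieved ID that is expected, relative to the expected total.
        expected_set = set(expected_ids)
        hits = sum(1 for doc_id in retrieved_ids if doc_id in expected_set)
        return hits / len(expected_ids)
    # Default mode: a single match is enough for a full score.
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


print(hit_rate_score(["a", "b"], []))    # retriever found nothing
print(hit_rate_score(["a", "b"], None))  # IDs not provided at all
print(hit_rate_score(["a", "b"], ["b", "c"]))
print(hit_rate_score(["a", "b"], ["b", "c"], use_granular_hit_rate=True))
```

The first two calls cover the empty and `None` cases from this issue and both score 0.0; the last two show that a normal retrieval is unaffected by the guard.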
Bug Description
When the retriever does not find a node, using RetrieverEvaluator to evaluate will result in an error: "Retrieved ids and expected ids must be provided."
Version
0.10.43
Steps to Reproduce
When the retriever does not find a node, using RetrieverEvaluator to evaluate will result in an error: "Retrieved ids and expected ids must be provided."
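The failure mode can be illustrated without a retriever or an index. This is only a sketch under stated assumptions: it does not call LlamaIndex, and `strict_hit_rate` is a hypothetical stand-in that assumes the metric rejects missing or empty ID lists with the message quoted above. An empty `retrieved_ids` list models a retriever that found no nodes.

```python
from typing import List, Optional


def strict_hit_rate(
    expected_ids: Optional[List[str]],
    retrieved_ids: Optional[List[str]],
) -> float:
    """Hypothetical stand-in: assumes empty or missing IDs are rejected."""
    if not retrieved_ids or not expected_ids:
        raise ValueError("Retrieved ids and expected ids must be provided.")
    return 1.0 if any(doc_id in expected_ids for doc_id in retrieved_ids) else 0.0


def reproduce() -> str:
    """Return the error message raised when nothing was retrieved."""
    try:
        # The retriever returned no nodes, so the ID list is empty.
        strict_hit_rate(expected_ids=["node-1"], retrieved_ids=[])
    except ValueError as err:
        return str(err)
    return "no error"


print(reproduce())
```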
Relevant Logs/Tracebacks