**Open** · bugtig6351 opened 5 days ago
**Attention: Patch coverage is 94.33962% with 3 lines in your changes missing coverage. Please review.**

Project coverage is 81.21%. Comparing base (`c253114`) to head (`95f6cd4`).
| Files | Patch % | Lines |
|---|---|---|
| rageval/metrics/_context_recall.py | 66.66% | 1 Missing :warning: |
| rageval/metrics/_context_reject_rate.py | 50.00% | 1 Missing :warning: |
| rageval/metrics/base.py | 85.71% | 1 Missing :warning: |
Main changes:

Since each sample's score is computed in the `_compute_one()` method and the final results are aggregated using `np.average(scores)`, I will use the following `_compute_batch()` as the default implementation for the `Metric` class, and `_compute_one()` as an abstract method that needs to be implemented by each metric individually.

```python
from typing import Iterable, List, Optional

from tqdm import tqdm


def _compute_batch(
    self,
    pred_answers: Optional[Iterable] = None,
    ref_answers: Optional[Iterable] = None,
    *args: Optional[Iterable]
) -> List[float]:
    """Compute the metric for a batch of predictions and references."""
    scores = []
    # Score each (prediction, references) pair individually; subclasses
    # only need to implement the per-sample _compute_one().
    for pred, refs in tqdm(zip(pred_answers, ref_answers),
                           desc=f"Computing {self.name}",
                           total=len(pred_answers)):
        scores.append(self._compute_one(pred, refs))
    return scores
```
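With that default in place, a subclass only has to supply the per-sample logic. A minimal sketch, assuming a hypothetical exact-match metric (the class name and string handling below are illustrative, not part of this PR):

```python
from rageval.metrics.base import Metric


class AnswerExactMatch(Metric):  # hypothetical metric, for illustration only
    name = "answer_exact_match"

    def _compute_one(self, pred_answer, ref_answers) -> float:
        # 1.0 if the prediction matches any reference exactly, else 0.0.
        return float(any(pred_answer.strip() == ref.strip()
                         for ref in ref_answers))
```

`_compute_batch()` is then inherited unchanged, and the dataset-level score still comes from `np.average(scores)`.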
`normalize=True`.

There are two issues:
The first is that the `MetricWithLLM` class involves model calls. The current approach is to package all the data and delegate the batch operation to the `BaseLLM.generate()` method for processing.
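For illustration, that delegation could look roughly like the sketch below; `_make_prompt`, `_parse_output`, and the `self.llm` attribute are assumptions made for the sketch, not names from this PR:

```python
from typing import Iterable, List, Optional


def _compute_batch(
    self,
    pred_answers: Optional[Iterable] = None,
    ref_answers: Optional[Iterable] = None,
    *args: Optional[Iterable]
) -> List[float]:
    # Package the whole batch as prompts and hand it to the LLM in one call.
    prompts = [self._make_prompt(pred, refs)      # hypothetical prompt builder
               for pred, refs in zip(pred_answers, ref_answers)]
    outputs = self.llm.generate(prompts)          # one BaseLLM.generate() call
    # Map each raw model output back to a per-sample score.
    return [self._parse_output(out) for out in outputs]  # hypothetical parser
```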
The second concerns the plain `Metric` class: take `_compute_one` as the abstract method and write a default implementation of the `_compute_batch` method, then adapt the existing `F1`, `ChrF` and `Ter` metrics to it.
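As a sketch of what the per-sample interface could look like for one of these, assuming the `sacrebleu` package backs the ChrF metric (the class shape is illustrative and may not match the actual diff):

```python
from sacrebleu.metrics import CHRF

from rageval.metrics.base import Metric


class AnswerCHRFCorrectness(Metric):  # illustrative shape, not the PR's diff
    name = "answer_chrf"

    def __init__(self):
        super().__init__()
        self.chrf = CHRF()

    def _compute_one(self, pred_answer, ref_answers) -> float:
        # sacrebleu scores one hypothesis against a list of references.
        return self.chrf.sentence_score(pred_answer, ref_answers).score
```

`Ter` could follow the same pattern via `sacrebleu.metrics.TER`, with the inherited `_compute_batch()` handling iteration and progress reporting.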