obss / jury

Comprehensive NLP Evaluation System
MIT License

ZeroDivisionError: division by zero in AccuracyForLanguageGeneration._compute_single_pred_single_ref #122

Closed NISH1001 closed 1 year ago

NISH1001 commented 1 year ago

Describe the bug: I was running RobertaForQuestionAnswering on HuggingFace's squad-v2 train set (~86k examples). The Accuracy metric threw a division-by-zero error in AccuracyForLanguageGeneration._compute_single_pred_single_ref.


To Reproduce

Expected behavior Run without error.

Exception Traceback:

ration.py:107, in AccuracyForLanguageGeneration._compute_single_pred_single_ref(self, predictions, references, reduce_fn, **kwargs)
    105         if token in ref_counts:
    106             score += min(pred_count, ref_counts[token])  # Intersection count
--> 107     scores.append(score / max(len(pred), len(ref)))
    108 avg_score = sum(scores) / len(scores)
    109 return {"score": avg_score}

ZeroDivisionError: division by zero

Environment Information:


Thanks, I appreciate that jury exists. I could patch this by cloning the repo and doing an in-depth trace analysis, but I wanted to know if there's a better way to patch this.

NISH1001 commented 1 year ago

Re: I found the issue. It happens during the AccuracyForLanguageGeneration._tokenize(...) step, which strips out some texts entirely, e.g. when both the prediction and the reference are just the literal '$'.

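The stripping behavior can be checked in isolation (hypothetical tokenizer mirroring the observed behavior; jury's real implementation may differ in details):

```python
import re

def tokenize(text):
    # A word-level regex tokenizer keeps only alphanumeric runs, so a
    # punctuation-only input produces no tokens at all.
    return re.findall(r"\w+", text)

print(tokenize("$"))          # []
print(tokenize("5 dollars"))  # ['5', 'dollars']
```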
NISH1001 commented 1 year ago

Re: I was able to patch it under try/catch block here: https://github.com/NISH1001/jury/commit/6bdf6800f7498b970fa8a81f3ea4bf88f6a12c32

Should I send a PR? I'm not sure whether to just emit a warning or also show the original value of the offending pred/ref pair.
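For reference, a guard along these lines avoids the crash (a sketch, not the exact code from the linked commit; the helper name, warning text, and fallback score of 0.0 are my assumptions):

```python
import warnings

def safe_pair_score(score, pred, ref):
    # Hypothetical helper: normalize the intersection count, but fall back
    # to 0.0 with a warning when tokenization left both sequences empty.
    denom = max(len(pred), len(ref))
    if denom == 0:
        warnings.warn(
            "Both prediction and reference tokenized to empty sequences; "
            "scoring this pair as 0.0."
        )
        return 0.0
    return score / denom
```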

devrimcavusoglu commented 1 year ago

Hi @NISH1001, thanks for the heads-up and for your comments; it's appreciated. I'll look into the PR ASAP.