pytorch / torcheval

A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to facilitate metric computation in distributed training and tools for PyTorch model evaluations.
https://pytorch.org/torcheval
Other
211 stars 46 forks source link

fix: correct reference length calculation #195

Closed yuxqiu closed 3 months ago

yuxqiu commented 4 months ago

Summary

This PR fixes the way brevity penalty (specifically the effective reference corpus length) is calculated in BLEU.

Previously, len_reference was calculated as min([len(ref) for ref in references_tokenized]). However, this is incorrect, because according to the paper, we need to find the "best match length", not the minimum reference length.

For more information, see wikipedia - brevity penalty and nltk implementation.

Test plan

I added another unit test to test_bleu.py and compared the results of the calculations to the results of the nltk.translate.bleu_score.corpus_bleu function to make sure the implementation is correct.

facebook-github-bot commented 4 months ago

@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

yuxqiu commented 4 months ago

Hi, I am wondering if there is a way for me to get the reason why the test is failing so that I can fix the problem.

JKSenthil commented 4 months ago

Hi @yuxqiu, thanks for this contribution! it seems some files unrelated to BLEU have been formatted in a way in which causes our linter to error, do you mind undo-ing those changes?

yuxqiu commented 4 months ago

@JKSenthil I've finished undo-ing all those changes.

facebook-github-bot commented 4 months ago

@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

JKSenthil commented 4 months ago

Hi @yuxqiu, thanks for reverting! We have identified the linter issue to be on our end, we'll land a fix first then rerun these tests again :)

facebook-github-bot commented 3 months ago

@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.