mjpost / sacrebleu

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
Apache License 2.0

Add BLEU max_ngram_order to signature #251

Open BramVanroy opened 10 months ago

BramVanroy commented 10 months ago

Currently, when you calculate BLEU with different max_ngram_order values and everything else the same, the resulting metrics all get the same signature from bleu_metric.get_signature().format(short=True), something like #:1|c:mixed|e:no|tok:13a|s:exp|v:2.3.1. Should a field be added to the signature to specify the max n-gram order, as with ChrF, where both nc and nw are specified?

If you agree I can do a PR.

martinpopel commented 10 months ago

The original chrF papers report results with different n-gram orders, other researchers have also tried (and reported) chrF with different orders, and I think no order was actually selected as the default in the papers, so it is natural that nc and nw are part of the signature. (The very first chrF paper mentions that "The best correlations are obtained for 6-gram", but the correlations are not shown.)

However, the original BLEU paper reports BLEU scores only with max n-gram order N=4, which has been considered the default/standard value for BLEU ever since. (The paper reports n-gram precisions for N=1...4 in Figure 2, but not the final BLEU score or its correlation with human judgments; those are reported only for N=4.) So I would suggest that we keep omitting max_ngram_order from the BLEU signature when the value is the default N=4. That said, I have nothing against adding max_ngram_order to the signature when the value is different. What do you think, @mjpost and @ozancaglayan?

ozancaglayan commented 3 months ago

I think it could be added when the value is different, as you suggested. That way, if the value is not changed, the signatures at least stay backwards-compatible.
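
For illustration, here is a minimal sketch of the behavior discussed in this thread: the n-gram order is written into the signature only when it differs from the default of 4. This is not sacrebleu's actual implementation. The function build_short_signature, the constant DEFAULT_MAX_NGRAM_ORDER, and the short key n are hypothetical names chosen for the example, and placing the new field just before the version is also an assumption.

```python
# Hypothetical sketch, not sacrebleu code: names and the "n" key are assumptions.
DEFAULT_MAX_NGRAM_ORDER = 4  # the standard BLEU order discussed above


def build_short_signature(fields, max_ngram_order=DEFAULT_MAX_NGRAM_ORDER):
    """Join key:value pairs with '|', adding the order only when non-default.

    `fields` is an ordered mapping whose last entry is assumed to be the version.
    """
    items = list(fields.items())
    if max_ngram_order != DEFAULT_MAX_NGRAM_ORDER:
        # Insert before the last entry so the version stays at the end.
        items.insert(len(items) - 1, ("n", str(max_ngram_order)))
    return "|".join(f"{key}:{value}" for key, value in items)


# Fields taken from the example signature in the first comment.
fields = {"#": "1", "c": "mixed", "e": "no", "tok": "13a", "s": "exp", "v": "2.3.1"}

default_sig = build_short_signature(fields)
bigram_sig = build_short_signature(fields, max_ngram_order=2)

print(default_sig)  # #:1|c:mixed|e:no|tok:13a|s:exp|v:2.3.1
print(bigram_sig)   # #:1|c:mixed|e:no|tok:13a|s:exp|n:2|v:2.3.1
```

With this approach, signatures produced with the default order are byte-for-byte identical to the existing ones, which is the backwards-compatibility property mentioned above, while any non-default order becomes visible.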