AlexGrinch closed this issue 5 years ago
In general, the SacreBLEU score is the number produced by this script: https://github.com/mjpost/sacreBLEU. The BLEU score comes from this one: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/generic/multi-bleu.perl
You can read about the difference between them here: https://arxiv.org/abs/1804.08771
I'm not sure why you are seeing the ridiculously high score with --sacrebleu, but maybe you are not passing --remove-bpe?
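To illustrate why a missing --remove-bpe could matter, here is a minimal sketch of what removing BPE markers does to a hypothesis line. It assumes subword-nmt style "@@ " continuation markers; the helper name `strip_bpe` and the sample sentence are made up for illustration, and this is not taken from the fairseq source.

```python
def strip_bpe(line: str, marker: str = "@@ ") -> str:
    """Join subword pieces back into words by deleting the continuation marker.

    Assumption: pieces are marked with a trailing "@@" followed by a space.
    """
    # Append a space so a marker at the very end of the line is also removed.
    return (line + " ").replace(marker, "").rstrip()


# Hypothetical model output that still contains BPE pieces.
hypothesis_bpe = "the trans@@ lation qual@@ ity is good"
hypothesis_words = strip_bpe(hypothesis_bpe)

print(hypothesis_words)
# Token counts differ, so n-gram statistics are computed over different units.
print(len(hypothesis_bpe.split()), len(hypothesis_words.split()))
```

If the markers are left in, n-grams are matched over subword pieces rather than whole words, so the resulting number is not comparable to a score computed on plain text.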
Hello!
Could you please elaborate on the difference between the BLEU and SacreBLEU scores reported in the Fairseq paper? How can I calculate SacreBLEU for, say, the output of the DynamicConv model? I can reproduce 29.7 BLEU with fairseq-score, but when I run fairseq-score with the --sacrebleu flag, I get a ridiculously high score of 33.8.
Thanks