Hi,
In the function EvalStrs(pred_strs, golds) in utils.py, I am not sure whether the use of bleu_score(candidate, references) is correct. I checked the torchtext documentation: the inputs to bleu_score should be an iterable of candidate translations and an iterable of iterables of reference translations. But in the current code, the inputs look like [['a','b'], ['c','d']] and [['e','f'], ['g','h']], i.e. the references are only nested one level deep. When I tested with two identical inputs, the BLEU score was 0, which suggests something is wrong. Are the inputs to bleu_score correct during evaluation, or is there a problem with my understanding?
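To illustrate what I mean about the nesting, here is a minimal sketch of corpus BLEU (my own simplified version, not torchtext's implementation; it omits the brevity penalty, which is 1 for identical inputs anyway). With the extra nesting level, identical inputs score 1.0; with the flat nesting currently used, each token gets treated as a whole reference sentence, so even identical data scores 0:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    # all contiguous n-grams of a token sequence
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidates, references, max_n=4):
    """Corpus BLEU with uniform weights and no brevity penalty.
    `references[i]` must be a LIST of reference token lists for
    `candidates[i]` (the nesting torchtext's bleu_score expects)."""
    precisions = []
    for n in range(1, max_n + 1):
        matched, total = 0, 0
        for cand, refs in zip(candidates, references):
            cand_counts = Counter(ngrams(cand, n))
            # clip each n-gram count against the best-matching reference
            max_ref = Counter()
            for ref in refs:
                for g, c in Counter(ngrams(ref, n)).items():
                    max_ref[g] = max(max_ref[g], c)
            matched += sum(min(c, max_ref[g]) for g, c in cand_counts.items())
            total += sum(cand_counts.values())
        precisions.append(matched / total if total else 0.0)
    if min(precisions) == 0:
        return 0.0
    # geometric mean of the n-gram precisions
    return math.exp(sum(math.log(p) for p in precisions) / max_n)

cands = [['a', 'b', 'c', 'd'], ['e', 'f', 'g', 'h']]

# Correct nesting: one list of reference sentences per candidate.
print(simple_bleu(cands, [[c] for c in cands]))  # 1.0

# Flat nesting as in the current code: identical data still scores 0,
# matching what I observed.
print(simple_bleu(cands, cands))  # 0.0
```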
Thanks.