serenayj / ABCD-ACL2021

ABCD: A Graph Framework to Convert Complex Sentences to a Covering Set of Simple Sentences
MIT License

Problem evaluating with the BLEU metric #5

Open yangjingyi opened 3 years ago

yangjingyi commented 3 years ago

Hi,

In the function EvalStrs(pred_strs, golds) in utils.py, I am not sure whether the use of bleu_score(candidate, references) is correct. I checked the torchtext documentation: the inputs to bleu_score should be an iterable of candidate translations and an iterable of iterables of reference translations. But in the current code, the inputs look like [['a','b'], ['c', 'd']] and [['e','f'], ['g', 'h']], so the references are missing one level of nesting. When I tested with two identical inputs, the BLEU score was 0, which suggests something is wrong. Are the inputs to bleu_score correct during evaluation, or am I misunderstanding something?
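To illustrate what I mean about the nesting, here is a minimal pure-Python sketch of corpus-level BLEU (my own code, not the repo's) that uses the same input shapes the torchtext docs describe: candidates are a list of token lists, and references are a list of *lists of* token lists, one inner list per candidate.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of the token sequence as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidates, references_list, max_n=4):
    """Corpus BLEU sketch.

    candidates:      list of token lists, e.g. [['a', 'b'], ...]
    references_list: list of lists of token lists, e.g. [[['a', 'b']], ...]
                     (note the extra nesting: several references per candidate)
    """
    weights = [1.0 / max_n] * max_n
    clipped = [0] * max_n   # clipped n-gram matches per order
    total = [0] * max_n     # candidate n-gram counts per order
    cand_len = 0
    ref_len = 0
    for cand, refs in zip(candidates, references_list):
        cand_len += len(cand)
        # closest reference length, as in standard BLEU
        ref_len += min((len(r) for r in refs),
                       key=lambda L: (abs(L - len(cand)), L))
        for n in range(1, max_n + 1):
            cand_counts = Counter(ngrams(cand, n))
            max_ref = Counter()
            for ref in refs:
                for g, c in Counter(ngrams(ref, n)).items():
                    max_ref[g] = max(max_ref[g], c)
            total[n - 1] += max(len(cand) - n + 1, 0)
            clipped[n - 1] += sum(min(c, max_ref[g])
                                  for g, c in cand_counts.items())
    # if any n-gram order has zero matches, BLEU is 0 (torchtext does the same)
    if min(clipped) == 0:
        return 0.0
    log_p = sum(w * math.log(c / t)
                for w, c, t in zip(weights, clipped, total))
    bp = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(log_p)

# identical 4-token candidate and reference, correct nesting -> 1.0
print(bleu([['my', 'full', 'pytorch', 'test']],
           [[['my', 'full', 'pytorch', 'test']]]))  # 1.0
# identical 2-token inputs still score 0.0 with max_n=4, because
# no trigrams or 4-grams exist; with max_n=2 the same pair scores 1.0
print(bleu([['a', 'b']], [[['a', 'b']]]))           # 0.0
print(bleu([['a', 'b']], [[['a', 'b']]], max_n=2))  # 1.0
```

If I read torchtext's bleu_score the same way, even with the correct nesting, two identical 2-token sentences would still give 0 at the default max_n=4 (no 3-grams or 4-grams match), so the zero score may come from both the nesting and the short test sentences.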

Thanks.