AIPHES / emnlp19-moverscore

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
MIT License

Multi reference MoverScore #20

Closed: amberhuang01 closed this issue 3 years ago

amberhuang01 commented 3 years ago

Hi,

Thanks for releasing the code.

I'm wondering how MoverScore is calculated when there are multiple references. Does it take the average over the references, or the max?

To give you a concrete example:

  1. At the sentence level, it looks like it takes the average. [image]
  2. At the paragraph level, it looks like it takes neither the max nor the average. [image]

Can you provide some guidance on this? Thanks in advance.

Regards, Amber

andyweizhao commented 3 years ago

MoverScore is averaged over the multiple references. So in your 2nd example, the results from [26] should equal the ones from running the multiple references as a whole. Apologies for the silly mistake, which I've fixed straight away.
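
The averaging described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: `multi_ref_score` and `toy_overlap` are hypothetical names, and `toy_overlap` is a stand-in word-overlap metric used only so the example runs without the MoverScore model. Any pairwise scorer with the signature `score_fn(reference, hypothesis) -> float` could be passed in instead.

```python
from statistics import mean
from typing import Callable, Sequence


def multi_ref_score(
    hypothesis: str,
    references: Sequence[str],
    score_fn: Callable[[str, str], float],
    aggregate: Callable[[Sequence[float]], float] = mean,
) -> float:
    """Score one hypothesis against several references.

    Hypothetical helper: scores the hypothesis against each reference
    separately, then combines the per-reference scores with `aggregate`
    (the mean by default, matching the averaging described in the reply).
    """
    if not references:
        raise ValueError("at least one reference is required")
    per_reference = [score_fn(ref, hypothesis) for ref in references]
    return aggregate(per_reference)


def toy_overlap(reference: str, hypothesis: str) -> float:
    """Stand-in metric (NOT MoverScore): fraction of reference words
    that also appear in the hypothesis."""
    ref_tokens = set(reference.lower().split())
    hyp_tokens = set(hypothesis.lower().split())
    if not ref_tokens:
        return 0.0
    return len(ref_tokens & hyp_tokens) / len(ref_tokens)


references = ["the cat sat", "a dog ran"]
hypothesis = "the cat sat"

# Per-reference scores are 1.0 and 0.0, so the mean is 0.5 and the max is 1.0.
average_score = multi_ref_score(hypothesis, references, toy_overlap)
max_score = multi_ref_score(hypothesis, references, toy_overlap, aggregate=max)
```

Passing `aggregate=max` gives the alternative asked about in the question, which makes it easy to check which of the two conventions a reported number follows.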