Why the EMD value from the emd_with_flow function is NOT directly used but applying "np.sum(flow * dst)" to recompute?

AIPHES / emnlp19-moverscore

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

MIT License


Why the EMD value from the emd_with_flow function is NOT directly used but applying "np.sum(flow * dst)" to recompute? #11

Closed: Chen-Wang-CUHK closed this issue 4 years ago

Chen-Wang-CUHK commented 4 years ago

Hi, thank you for your great work! I am curious why the EMD value returned by the emd_with_flow function is not used directly, and is instead recomputed with "np.sum(flow * dst)", as in lines 166-168 of moverscore_v2.py:

            _, flow = emd_with_flow(c1, c2, dst)
            flow = np.array(flow, dtype=np.float32)
            score = 1 - np.sum(flow * dst)

Is there any special reason to recompute the EMD value by "np.sum(flow * dst)"? Thank you.

andyweizhao commented 4 years ago

Hi Chen,

I was trying to clean up the flow matrix by masking some of its values before computing the score, but that attempt was not very successful.

Wei
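
For readers following the thread, here is a minimal sketch of why the recomputation matters only once the flow is modified. It uses NumPy alone and a hand-made 2x2 toy problem. The distance matrix, the flow values, the helper names `transport_cost` and `mask_small_flows`, and the threshold are all assumptions made up for illustration, not code from moverscore_v2.py. With an unmodified optimal flow and histograms of equal total mass, "np.sum(flow * dst)" should match the first value returned by emd_with_flow, so the two ways of scoring agree. Masking entries of the flow changes the sum, which is the only case where recomputing gives a different score.

```python
import numpy as np


def transport_cost(flow, dst):
    """Total cost of moving mass according to `flow` under ground distances `dst`."""
    flow = np.asarray(flow, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    return float(np.sum(flow * dst))


def mask_small_flows(flow, threshold):
    """Hypothetical cleaning step: zero out flow entries below `threshold`."""
    flow = np.asarray(flow, dtype=np.float64)
    return np.where(flow >= threshold, flow, 0.0)


# Toy ground distances between 2 source bins and 2 target bins (made-up values).
dst = np.array([[0.1, 0.9],
                [0.8, 0.2]])

# Hand-derived optimal flow for source masses [0.5, 0.5] and target masses
# [0.6, 0.4]; it stands in for the second return value of emd_with_flow.
flow = np.array([[0.5, 0.0],
                 [0.1, 0.4]])

# Unmasked: the sum is the minimal transport cost, about 0.21 for this toy case.
full_cost = transport_cost(flow, dst)
full_score = 1 - full_cost

# Masked: dropping the 0.1 entry removes its cost, leaving about 0.13.
masked_cost = transport_cost(mask_small_flows(flow, threshold=0.2), dst)
masked_score = 1 - masked_cost

print(full_score, masked_score)
```

Because masking only removes non-negative terms from the sum, the masked cost can never exceed the unmasked one, so a masked score of this form is always at least as high as the plain EMD-based score.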