facebookresearch / SentEval

A python tool for evaluating the quality of sentence embeddings.

Added NaN check to STS evaluations #7

Closed. tscheepers closed this 7 years ago

tscheepers commented 7 years ago

If either argument passed to the similarity function is a zero vector, the similarity evaluates to NaN. This can happen when a sentence is passed to the batcher but the batcher has no embeddings for any of the words in that sentence (https://github.com/facebookresearch/SentEval/blob/master/examples/bow.py#L58).

A single sentence with a zero vector is enough to make all.spearman.mean come out as NaN.
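To illustrate the failure mode described above, here is a minimal sketch, assuming the similarity is a plain cosine similarity computed with NumPy. The function names `cosine_similarity` and `safe_cosine_similarity` are hypothetical and are not taken from SentEval; the guard shown (mapping NaN to 0.0 with `np.nan_to_num`) is one possible way to add a NaN check, not necessarily the exact change made in this PR.

```python
import numpy as np


def cosine_similarity(u, v):
    """Plain cosine similarity; yields NaN when either vector has zero norm."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # 0.0 / 0.0 produces NaN; silence the accompanying RuntimeWarning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def safe_cosine_similarity(u, v):
    """Hypothetical guard (an assumption, not SentEval's API): NaN becomes 0.0."""
    return float(np.nan_to_num(cosine_similarity(u, v)))


if __name__ == "__main__":
    zero = [0.0, 0.0, 0.0]   # e.g. a sentence with no known word embeddings
    other = [1.0, 2.0, 3.0]

    unguarded = [cosine_similarity(other, other), cosine_similarity(zero, other)]
    guarded = [safe_cosine_similarity(other, other), safe_cosine_similarity(zero, other)]

    # One NaN is enough to turn an aggregate over all scores into NaN.
    print("mean without guard is NaN:", bool(np.isnan(np.mean(unguarded))))
    print("mean with guard is NaN:", bool(np.isnan(np.mean(guarded))))
```

Mapping NaN to 0.0 treats a sentence with no usable embedding as unrelated to everything else. That choice keeps the aggregate scores finite but still counts the pair in the evaluation, so a large number of zero-vector sentences would lower the reported correlations.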

This PR fixes the NaN issue. Example result after the fix:

2017-08-03 15:13:35,590 : ***** Transfer task : STS16 *****
2017-08-03 15:13:35,652 : answer-answer : pearson = 0.1550, spearman = 0.1854
2017-08-03 15:13:35,676 : headlines : pearson = 0.2466, spearman = 0.3175
2017-08-03 15:13:35,698 : plagiarism : pearson = 0.5306, spearman = 0.6027
2017-08-03 15:13:35,722 : postediting : pearson = 0.5524, spearman = 0.6829
2017-08-03 15:13:35,742 : question-question : pearson = 0.2708, spearman = 0.2527
2017-08-03 15:13:35,742 : ALL (weighted average) : Pearson = 0.3493,             Spearman = 0.4083
2017-08-03 15:13:35,742 : ALL (average) : Pearson = 0.3511,             Spearman = 0.4082