I am trying to obtain the semantic similarity between the generated and the ground truth sentence.
I used all these metrics to evaluate the generated sentences (validation dataset):
| Metric | Score |
| --- | --- |
| BLEU 1 | 0.128031 |
| BLEU 2 | 0.056153 |
| BLEU 3 | 0.029837 |
| BLEU 4 | 0.013649 |
| METEOR | 0.305482 |
| ROUGE_L | 0.148652 |
| CIDEr | 0.069519 |
| SkipThought cosine similarity | 0.765784 |
| Embedding Average cosine similarity | 0.973187 |
| Vector Extrema cosine similarity | 0.683888 |
| Greedy Matching score | 0.944960 |
Some of these metrics indicate that the sentences are quite similar, while others suggest they are different. Can you please suggest a metric for measuring the semantic similarity between sentences?
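For reference, here is a minimal sketch of how two of the embedding-based scores above (Embedding Average and Greedy Matching) are typically computed. The word vectors below are random toy stand-ins for real pretrained embeddings (e.g. GloVe), so the numbers are illustrative only:

```python
import numpy as np

# Toy word embeddings -- hypothetical stand-ins for pretrained vectors,
# used only to illustrate how the metrics are computed.
rng = np.random.default_rng(0)
vocab = ["a", "cat", "sat", "lay", "on", "the", "mat", "rug"]
emb = {w: rng.standard_normal(8) for w in vocab}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def embedding_average(s1, s2):
    """Cosine similarity between the mean word vectors of the sentences."""
    m1 = np.mean([emb[w] for w in s1], axis=0)
    m2 = np.mean([emb[w] for w in s2], axis=0)
    return cosine(m1, m2)

def greedy_matching(s1, s2):
    """Average, over both directions, of each word's best cosine match."""
    def one_way(a, b):
        return np.mean([max(cosine(emb[x], emb[y]) for y in b) for x in a])
    return float((one_way(s1, s2) + one_way(s2, s1)) / 2)

generated = "the cat sat on the mat".split()
reference = "a cat lay on the rug".split()
print(embedding_average(generated, reference))
print(greedy_matching(generated, reference))
```

Note that averaging word vectors (Embedding Average) tends to wash out word-level differences, which may explain why that score is so much higher than the word-overlap metrics (BLEU, ROUGE) on the same outputs.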
How about InferSent and Word Mover's Distance? I think you should consider adding these metrics when evaluating text generation. This repository is helpful for evaluating generated text.
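With real pretrained vectors loaded into gensim, `KeyedVectors.wmdistance()` computes the exact Word Mover's Distance. As a self-contained illustration, here is a sketch of the *relaxed* WMD lower bound from Kusner et al. (2015), where each word simply moves all its mass to its nearest neighbour in the other sentence; the embeddings are toy random vectors, not real ones:

```python
import numpy as np

# Toy embeddings -- hypothetical stand-ins for pretrained word vectors.
rng = np.random.default_rng(1)
vocab = ["a", "cat", "sat", "lay", "on", "the", "mat", "rug"]
emb = {w: rng.standard_normal(8) for w in vocab}

def relaxed_wmd(s1, s2):
    """Relaxed WMD lower bound: each word travels to its nearest
    counterpart; the tighter (larger) of the two one-sided bounds is kept."""
    def one_way(a, b):
        return np.mean([min(np.linalg.norm(emb[x] - emb[y]) for y in b)
                        for x in a])
    return max(one_way(s1, s2), one_way(s2, s1))

generated = "the cat sat on the mat".split()
reference = "a cat lay on the rug".split()
print(relaxed_wmd(generated, reference))   # distance: lower means more similar
```

Unlike the cosine-based scores above, WMD is a distance, so identical sentences score 0 and dissimilar ones score higher.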