chatopera / Synonyms

:herb: Chinese synonyms: a toolkit for chatbots and intelligent question answering
https://bot.chatopera.com/

enhance Synonyms#compare #6

Closed hailiang-wang closed 6 years ago

hailiang-wang commented 6 years ago

description

Use a more advanced method to compute the similarity between two sentences.

Related to #4.

solution

Leverage more advanced distance measures.

hailiang-wang commented 6 years ago

Comparison of distance measures

[3 images comparing the distance measures]

From https://lyfat.wordpress.com/2012/05/22/euclidean-vs-chebyshev-vs-manhattan-distance/
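For readers who cannot see the images, here is a minimal sketch of the three measures named in the linked post (Euclidean, Manhattan, Chebyshev). The function names and the example vectors are made up for illustration and are not taken from the Synonyms code base.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of the absolute differences per dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Chessboard distance: largest absolute difference over all dimensions."""
    return max(abs(x - y) for x, y in zip(a, b))

# Illustrative vectors only (a 3-4-5 triangle keeps the numbers easy to check).
u = [0.0, 0.0]
v = [3.0, 4.0]
print(euclidean(u, v), manhattan(u, v), chebyshev(u, v))  # 5.0 7.0 4.0
```

For the same pair of vectors, Chebyshev is never larger than Euclidean, and Euclidean is never larger than Manhattan, so the choice of measure changes how strongly a single large per-dimension difference is penalised.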

huyingxi commented 6 years ago

Thanks. I tested several different similarity measures and found that Euclidean distance works best. I also merged in some other features, such as the synonyms' word embeddings, unigram overlap, POS tags ...
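A rough sketch of how a Euclidean-distance score over word embeddings could be merged with a unigram-overlap feature. This is not the code from the commit: the helper names, the `1 / (1 + d)` mapping from distance to similarity, the Jaccard form of the overlap, the weights `w_embed` and `w_overlap`, and the toy embeddings are all assumptions for illustration, and the POS feature is left out.

```python
import math

def sentence_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding (None if none do)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def euclidean_similarity(a, b):
    """Map Euclidean distance d to a score in (0, 1]; 1 / (1 + d) is an assumed mapping."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + d)

def unigram_overlap(tokens_a, tokens_b):
    """Jaccard overlap of the two token sets (one possible overlap definition)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def combined_similarity(tokens_a, tokens_b, embeddings, w_embed=0.8, w_overlap=0.2):
    """Weighted sum of the two features; the default weights are hypothetical."""
    vec_a = sentence_vector(tokens_a, embeddings)
    vec_b = sentence_vector(tokens_b, embeddings)
    if vec_a is None or vec_b is None:
        embed_score = 0.0
    else:
        embed_score = euclidean_similarity(vec_a, vec_b)
    return w_embed * embed_score + w_overlap * unigram_overlap(tokens_a, tokens_b)

# Toy 2-dimensional embeddings, invented for this example.
toy_embeddings = {"天气": [1.0, 0.0], "很好": [0.0, 1.0], "不错": [0.0, 1.0]}
score = combined_similarity(["天气", "很好"], ["天气", "不错"], toy_embeddings)
print(round(score, 4))
```

In the toy example the two sentences share only one surface token, but "很好" and "不错" have the same toy vector, so the embedding feature treats the sentences as identical while the overlap feature does not.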

hailiang-wang commented 6 years ago

Commit c0a477555174baa5f5c03243c3bea79aa206b8e9

huyingxi commented 6 years ago

[image: roc_curve_eulidean0 8 unigram0 2 pos]

I just tested on the 778 sentence pairs (link: https://github.com/fssqawj/SentenceSim/blob/master/dev.txt) and got these curves:

[image: pr_hresholds_eulidean0 8 unigram0 2 pos]
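For anyone who wants to reproduce this kind of plot, here is a minimal sketch of how ROC points and precision/recall at a threshold can be computed from similarity scores and binary labels. The function names and the four toy score/label pairs are invented for illustration; they are not results from the 778 pairs, and the sketch assumes both classes are present in the labels.

```python
def roc_points(scores, labels):
    """Return (threshold, false_positive_rate, true_positive_rate) per distinct score.

    A pair is predicted "similar" when its score is >= the threshold.
    Assumes labels contains at least one 1 and at least one 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points

def precision_recall(scores, labels, threshold):
    """Precision and recall when pairs with score >= threshold count as similar."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    predicted = sum(1 for s in scores if s >= threshold)
    positives = sum(labels)
    precision = tp / predicted if predicted else 1.0  # convention when nothing is predicted
    recall = tp / positives
    return precision, recall

# Toy data only: similarity scores and gold labels (1 = similar, 0 = not similar).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
for point in roc_points(toy_scores, toy_labels):
    print(point)
print(precision_recall(toy_scores, toy_labels, 0.4))
```

Plotting the second against the third element of each tuple gives the ROC curve, and sweeping `threshold` in `precision_recall` gives precision and recall as functions of the threshold.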