Does the author have any good suggestions on Lucene's similarity-matching algorithm? (discussion)

zhangmt / jcseg

Automatically exported from code.google.com/p/jcseg


Does the author have any good suggestions on Lucene's similarity-matching algorithm? (discussion) #3

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago

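First of all, thank you to the author for the hard work that went into such a good word-segmentation tool. I am using it in my project and it works very well, thanks.
I wonder whether the author has any good suggestions on similarity matching with Lucene.
For example: a=中国XX b=天朝XX (中国 is "China"; 天朝 is a colloquial synonym for it)
a and b should get the same matching similarity, but in practice, when synonyms such as "中国=天朝" or "单车=自行车=脚踏车" (three words for "bicycle") are not collected thoroughly enough, the similarity judgment becomes inaccurate.
The above is synonym-based matching. As for "concept matching" (which Baidu and Google would probably call "intelligent matching"?), does anyone have thoughts on that kind of algorithm? I hope we can all exchange ideas.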

Original issue reported on code.google.com by byrss...@gmail.com on 8 Feb 2013 at 5:27

GoogleCodeExporter commented 9 years ago

I went home for the Chinese New Year and had no internet access there, so I am sorry for replying this late.

Original comment by chenxin6...@gmail.com on 12 Feb 2013 at 11:18

GoogleCodeExporter commented 9 years ago

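Your mention of "concept matching" reminds me of Baidu's "intelligent matching" advertising service, whose results still leave room for improvement. As you said, in a retrieval system built on an inverted document index, synonyms can reach the same relevance score, so collecting synonyms more thoroughly really is a good approach (I once thought about integrating every entry of the Chinese synonym dictionary 《中华同义词词典》 into the jcseg lexicon). As for "concept matching" algorithms, I have never worked on them (it is not my research area), so I am hardly in a position to discuss them with you, haha... If you do get hold of relevant material, please send me a copy so I can learn from it too.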
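
To illustrate the synonym point discussed above, here is a minimal sketch in Python. It is not Lucene's scoring formula and not jcseg code: it uses a plain term-frequency cosine similarity, the synonym groups are hypothetical, and the token lists are assumed to be the output of a segmenter such as jcseg. The idea is only that mapping every synonym to one canonical term before comparison makes a=中国XX and b=天朝XX score the same as identical texts.

```python
from collections import Counter
from math import sqrt

# Hypothetical synonym groups (an assumption for illustration).
# The first word of each group is used as the canonical form.
SYNONYM_GROUPS = [
    ["中国", "天朝"],
    ["自行车", "单车", "脚踏车"],
]

# Map every word in a group to that group's canonical form.
CANONICAL = {word: group[0] for group in SYNONYM_GROUPS for word in group}


def normalize(tokens):
    """Replace each token by its canonical synonym, if it has one."""
    return [CANONICAL.get(token, token) for token in tokens]


def cosine(a_tokens, b_tokens):
    """Term-frequency cosine similarity between two token lists."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity(a_tokens, b_tokens):
    """Cosine similarity computed after synonym normalization."""
    return cosine(normalize(a_tokens), normalize(b_tokens))


if __name__ == "__main__":
    a = ["中国", "XX"]  # assumed segmenter output for "中国XX"
    b = ["天朝", "XX"]  # assumed segmenter output for "天朝XX"
    print("without synonyms:", cosine(a, b))
    print("with synonyms:   ", similarity(a, b))
```

Without normalization the two texts share only the token XX, so the score is well below that of identical texts. With normalization they become indistinguishable. Any synonym pair missing from the groups falls back to the lower score, which is the inaccuracy described in the original question.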

Original comment by chenxin6...@gmail.com on 12 Feb 2013 at 11:23

GoogleCodeExporter commented 9 years ago

Also, I am glad jcseg has been useful for your project. Thanks for the kind words, O(∩_∩)O~

Original comment by chenxin6...@gmail.com on 12 Feb 2013 at 11:25

GoogleCodeExporter commented 9 years ago

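Ha, happy New Year!
As for concept matching, sigh, it is more theory than practice and still needs time to mature.
I am studying your segmentation algorithm and have run into a few small questions; let's discuss them when there is time.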

Original comment by byrss...@gmail.com on 24 Feb 2013 at 2:54

GoogleCodeExporter commented 9 years ago

Original comment by chenxin6...@gmail.com on 10 May 2013 at 5:11

Changed title: Does the author have any good suggestions on Lucene's similarity-matching algorithm? (discussion)