hankcs / HanLP

Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
https://hanlp.hankcs.com/en/
Apache License 2.0
33.83k stars 10.12k forks

What are the main differences between this project and HIT's LTP or the one from Fudan? #56

Closed daqulazhang closed 9 years ago

daqulazhang commented 9 years ago

As the title says. Sorry, I can't read through all the code at once, so I'm asking directly :)

hankcs commented 9 years ago

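Some projects are built for production and others for papers, which leads to huge differences in code quality, ease of use, runtime efficiency, and memory footprint. Only by trying them yourself can you tell which one suits you better.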

daqulazhang commented 9 years ago

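OK, differences between projects and their eras do lead to some differences. What was HanLP's design goal? Was it meant to be commercial quality, or is it a hobby?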

hankcs commented 9 years ago

At first it was for commercial purposes; after I left the company it became my hobby.

daqulazhang commented 9 years ago

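Could we add each other as contacts? It seems GitHub no longer supports private messages :( My email is daqula@zhaitech.com. Shall we discuss the details there?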

hankcs commented 9 years ago

Sure.

mibdennis commented 6 years ago

Would it be convenient to add you as a contact to chat? My email is deny.tsai@gmail.com. Thanks.

hankcs commented 6 years ago

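Hi, I'm currently studying abroad. Because of the time difference, limited energy, and personal preference, I rarely use instant messaging. For project matters, please open an issue on GitHub; for private matters, please send an email (the address is at the top of every file in the latest code).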

mibdennis commented 6 years ago

Hello,

What are the sources of HanLP's Chinese-to-pinyin dictionary? https://github.com/hankcs/HanLP/blob/master/data/dictionary/pinyin/pinyin.txt

It seems to include pinyin4j: https://github.com/belerweb/pinyin4j/blob/master/src/main/resources/pinyindb/multi_pinyin.txt

But what are the other sources?

Thanks!

hankcs commented 6 years ago

Please see http://www.hankcs.com/nlp/java-chinese-characters-to-pinyin-and-simplified-conversion-realization.html

In addition, part of it was contributed by @AnyListen.

mibdennis commented 6 years ago

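Thanks! I'm also currently studying for a master's in the US, and I'd like to use HanLP in a small recommender-system project for business phrases such as company names. In testing, I compared HanLP's standard statistical segmentation with a machine-learning-trained segmenter on the input keywords. The standard segmenter is more accurate on general vocabulary but splits some trademark names incorrectly. The trained one is the opposite: it handles trademarks better but is less accurate on general vocabulary, e.g. 物联网 (Internet of Things) is split into 物 and 联网. Are there any dictionaries or methods that could help improve accuracy here?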
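
To make the 物联网 case concrete, here is a minimal sketch of how dictionary coverage alone can change a split. It is not HanLP's algorithm: it uses a toy forward-maximum-matching segmenter, and the function name `forward_max_match` and the tiny dictionary are hypothetical, chosen only for illustration.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy left-to-right segmentation: at each position, take the
    longest dictionary word; fall back to a single character."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary; a real lexicon would be far larger.
base_dictionary = {"联网", "技术"}
text = "物联网技术"

# Without the term, 物联网 falls apart into 物 + 联网.
without_term = forward_max_match(text, base_dictionary)
# With the term added as a user entry, it is kept whole.
with_term = forward_max_match(text, base_dictionary | {"物联网"})

print(without_term)
print(with_term)
```

In this toy setting, adding the missing term to the lexicon is enough to keep it intact. Whether the same holds for a statistical or trained segmenter depends on how that segmenter weighs dictionary entries against its model.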

hankcs commented 6 years ago

Please see: https://github.com/hankcs/HanLP/issues/884#issuecomment-405066582 Also, if the topic you're discussing is unrelated to this issue, please open a new one so as not to disturb this issue's subscribers.

mibdennis commented 6 years ago

Thanks 🙏! That information is very helpful. Is the current 20M corpus of your openCorpus project publicly available?

mibdennis commented 6 years ago

A somewhat personal question :) Are you currently doing a PhD? If this is intrusive, please feel free not to answer. Thanks.

ysjiao commented 4 years ago

Hello, how does HanLP differ from HIT's LTP with respect to dependency parsing? Thanks!