How does this project mainly differ from Harbin Institute of Technology's LTP or the one from Fudan University?

hankcs / HanLP

Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification

https://hanlp.hankcs.com/en/

Apache License 2.0


How does this project mainly differ from Harbin Institute of Technology's LTP or the one from Fudan University? #56

Closed. daqulazhang closed this issue 9 years ago

daqulazhang commented 9 years ago

As the title says. Sorry, I couldn't read through all the code in one go, so I'm asking directly :)

hankcs commented 9 years ago

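Some projects are built for production and some are built for papers, which leads to large differences in code quality, ease of use, runtime efficiency, and memory overhead. Only by trying them yourself can you tell which one suits you better.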

daqulazhang commented 9 years ago

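OK, different projects and different eras certainly lead to some differences. What is HanLP's design goal? Is it meant to be commercial quality, or is it a hobby project?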

hankcs commented 9 years ago

At first it had a commercial purpose; after I left that job, it became my hobby.

daqulazhang commented 9 years ago

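Could we add each other as contacts? It seems GitHub no longer supports private messages :( My email is daqula@zhaitech.com; shall we discuss the details there?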

hankcs commented 9 years ago

OK.

mibdennis commented 6 years ago

Would it be convenient to add you as a contact so we can talk? My email is deny.tsai@gmail.com. Thanks.

hankcs commented 6 years ago

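Hi, I'm currently studying overseas. Because of the time difference, limited energy, and personal preference, I rarely use instant messaging. For project matters, please open an issue on GitHub; for personal matters, please email me (the address is at the top of every recent source file).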

mibdennis commented 6 years ago

Hello,

What are the sources of HanLP's Chinese-to-pinyin dictionary? https://github.com/hankcs/HanLP/blob/master/data/dictionary/pinyin/pinyin.txt

It seems to include pinyin4j: https://github.com/belerweb/pinyin4j/blob/master/src/main/resources/pinyindb/multi_pinyin.txt

But what are the other sources?

Thanks!

hankcs commented 6 years ago

Please see http://www.hankcs.com/nlp/java-chinese-characters-to-pinyin-and-simplified-conversion-realization.html

In addition, part of it was contributed by @AnyListen.

mibdennis commented 6 years ago

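Thanks! I'm also in the US, studying for a master's degree. I'd like to use HanLP in a small recommendation-system project for business phrases, such as company names. In testing, I compared HanLP's standard statistical segmenter with a segmenter trained by machine learning on input keywords. The standard segmenter is more accurate on general vocabulary but mis-segments some trademark names. The trained one is the opposite: it handles trademarks better but is less accurate on general vocabulary, for example splitting 物联网 (Internet of Things) into 物 and 联网. Are there any dictionaries or methods that would help improve accuracy here?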

hankcs commented 6 years ago

Please see: https://github.com/hankcs/HanLP/issues/884#issuecomment-405066582. Also, if the topic you want to discuss is unrelated to this issue, please open a new one so as not to disturb this issue's subscribers.
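
For illustration only, and not taken from the linked comment: one common way to protect domain terms such as trademarks is to let the segmenter consult a user dictionary. The minimal Python sketch below uses greedy forward maximum matching over a toy vocabulary. The function name `forward_max_match` and both vocabularies are hypothetical, and this is not how HanLP's statistical or trained segmenters work internally; it only shows why adding 物联网 as a dictionary entry can stop it from being split.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy longest-match segmentation against a set of known words."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, falling back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy vocabularies, for illustration only.
base_vocab = {"联网", "技术"}
user_vocab = {"物联网"}

# Without the user entry, 物联网 falls apart into 物 + 联网.
without_user_dict = forward_max_match("物联网技术", base_vocab)
# With the user entry, the longest match keeps 物联网 whole.
with_user_dict = forward_max_match("物联网技术", base_vocab | user_vocab)

print(without_user_dict)
print(with_user_dict)
```

A real segmenter would combine such entries with statistical scores rather than pure longest match, so treat this as a sketch of the dictionary idea only.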

mibdennis commented 6 years ago

Thank you 🙏! That information is very helpful. Is the current 20M corpus of your openCorpus project publicly available?

mibdennis commented 6 years ago

A somewhat personal question :) Are you currently doing a PhD? If this is intrusive, please feel free not to answer. Thanks.

ysjiao commented 4 years ago

Hello, how does HanLP differ from Harbin Institute of Technology's LTP specifically in terms of dependency parsing? Thanks!