stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0
9.71k stars 2.7k forks

Custom dictionary #1462

Closed: dashu101 closed this issue 3 months ago

dashu101 commented 3 months ago

What is the txt file format for custom dictionaries?

AngledLuffa commented 3 months ago

The file is plain text with one word per line. See the loading code:

https://github.com/stanfordnlp/CoreNLP/blob/375f24338c09b22d1596440864bc074f32c0feb9/src/edu/stanford/nlp/wordseg/ChineseDictionary.java#L158

dashu101 commented 3 months ago

Thanks, I have solved it. Looking at the source code, I found that a custom word cannot be longer than 6 characters.
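For anyone who lands here later, below is a minimal sketch of a pre-check for a custom dictionary file, assuming the two points from this thread: one word per line, and a per-word limit of 6 characters. It does not call CoreNLP at all. The class name `CustomDictCheck`, the constant `ASSUMED_MAX_WORD_LENGTH`, and the helper `tooLong` are hypothetical names made up for illustration, and the value 6 comes from the reading of `ChineseDictionary.java` reported above, so check it against the version you are running.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class CustomDictCheck {
    // Assumption: limit reported in this thread, not read from CoreNLP at runtime.
    static final int ASSUMED_MAX_WORD_LENGTH = 6;

    // Returns the entries (one per line, trimmed) that are longer than the assumed limit.
    // Blank lines are ignored. String.length() counts UTF-16 code units, which matches
    // the character count for common CJK characters.
    static List<String> tooLong(List<String> lines) {
        List<String> rejected = new ArrayList<>();
        for (String line : lines) {
            String word = line.trim();
            if (word.isEmpty()) {
                continue;
            }
            if (word.length() > ASSUMED_MAX_WORD_LENGTH) {
                rejected.add(word);
            }
        }
        return rejected;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java CustomDictCheck <dictionary.txt>");
            return;
        }
        // The dictionary is assumed to be a UTF-8 text file with one word per line.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String word : tooLong(lines)) {
            System.out.println("longer than " + ASSUMED_MAX_WORD_LENGTH + " characters: " + word);
        }
    }
}
```

Running it on the dictionary file before passing the file to the segmenter lists the entries that would probably not be usable, so they can be shortened or split.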


------------------ Original message ------------------ From: John Bauer. Sent: August 27, 2024, 23:01. To: stanfordnlp/CoreNLP. Cc: dashu101, Author. Subject: Re: [stanfordnlp/CoreNLP] Custom dictionary (Issue #1462)
