yanyiwu / nodejieba

The Node.js version of the "Jieba" Chinese word segmenter
MIT License
3.04k stars, 278 forks

Determining different parts of speech for the same word in different contexts #171

Closed kanasimi closed 1 month ago

kanasimi commented 3 years ago

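养一只小猫 从此连只苍蝇都进不来, 这还是只开始

(Roughly: "Get a kitten, and from then on not even a fly can get in, and this is only the beginning.")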

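The first two "只" should both be measure words (tag q), but both are tagged as adverbs (tag d).

Is there any way to support multiple parts of speech for the same word?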

related: https://github.com/fxsjy/jieba/issues/832

http://blog.pulipuli.info/2017/11/fasttag-identify-part-of-speech-in.html

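The biggest problem with Jieba's part-of-speech tagging is that a word can have only one part of speech, and that part of speech is whatever the dictionary assigns,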

kanasimi commented 3 years ago

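The ideal approach is to analyze the sentence pattern and the sentence's structure tree, determine every possible part of speech, and pick the most probable one.

related: Chinese sentence structure treebank (中文句結構樹資料庫) http://treebank.sinica.edu.tw/

A fallback approach is to build a dictionary and revise a word's part of speech according to the parts of speech of the words before and after it. For example, if the word is followed by a noun, especially an animal, then "隻" (the measure word) is more likely. I am currently trying this fallback approach here: https://github.com/kanasimi/Chinese_converter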
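A minimal sketch of the fallback approach, for illustration only. It is not code from nodejieba or Chinese_converter. It assumes tokens in the `{ word, tag }` shape that `nodejieba.tag()` returns. The rule table `CONTEXT_RULES`, the function `retagByContext`, and the sample tokens are all hypothetical, and the sample tokens are written by hand, so the real segmentation and tags may differ.

```javascript
// Hypothetical rule table: each rule re-tags specific words when the
// neighboring tokens match. Only one rule is shown here.
const CONTEXT_RULES = [
  {
    words: new Set(['只', '隻']),
    fromTag: 'd', // adverb, as assigned by the dictionary
    toTag: 'q',   // measure word
    // Fires when the next token is tagged as some kind of noun (n, nr, ns, ...).
    // This is a rough heuristic and will misfire on some sentences.
    matches: (prev, next) => next !== undefined && next.tag.startsWith('n'),
  },
];

// Returns a new token array; the input array is not modified.
function retagByContext(tokens, rules = CONTEXT_RULES) {
  return tokens.map((token, i) => {
    const prev = tokens[i - 1];
    const next = tokens[i + 1];
    const rule = rules.find(
      (r) => r.words.has(token.word) && r.fromTag === token.tag && r.matches(prev, next)
    );
    return rule ? { ...token, tag: rule.toTag } : token;
  });
}

// Hand-written tokens for the example sentence (assumed, not real tagger output).
const tokens = [
  { word: '养', tag: 'v' },
  { word: '一', tag: 'm' },
  { word: '只', tag: 'd' },
  { word: '小猫', tag: 'n' },
  { word: '从此', tag: 'd' },
  { word: '连', tag: 'p' },
  { word: '只', tag: 'd' },
  { word: '苍蝇', tag: 'n' },
  { word: '都', tag: 'd' },
  { word: '进不来', tag: 'v' },
  { word: '这', tag: 'r' },
  { word: '还是', tag: 'c' },
  { word: '只', tag: 'd' },
  { word: '开始', tag: 'v' },
];

const retagged = retagByContext(tokens);
const zhiTags = retagged.filter((t) => t.word === '只').map((t) => t.tag);
console.log(zhiTags.join(' ')); // prints "q q d"
```

With these assumed tokens, the first two "只" become measure words because a noun follows them, while the third stays an adverb because a verb follows it. A rule keyed only on the next token's tag is deliberately crude; a real dictionary would likely need more context, such as a preceding numeral.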

github-actions[bot] commented 1 month ago

This issue has not been updated for over 3 years and will be marked as stale. If the issue still exists, please comment or update the issue, otherwise it will be closed after 7 days.

github-actions[bot] commented 1 month ago

This issue has been automatically closed due to inactivity. If the issue still exists, please reopen it.