xiaoyifang / goldendict-ng

The Next Generation GoldenDict
https://xiaoyifang.github.io/goldendict-ng/

Does it support word-splitting for Chinese? #1466

Closed by Aulline 6 months ago

Aulline commented 6 months ago

First of all, thank you for reviving this great software!

I have a question: can it split Chinese text into words when searching dictionary entries? By this I mean that typing a sentence (for example, "今天天气很好") in the search bar would return links to the individual words in that sentence ("今天", "天气", etc.), and clicking one would take the user to the existing dictionary entry for that word. Currently, it either returns nothing or suggests the closest entries available. Abbyy Lingvo does this when the sentence cannot be found in dictionary entries, examples, etc.

![image](https://github.com/xiaoyifang/goldendict-ng/assets/108539217/374a792c-5656-49f4-beda-2f2ad10c6e20) ![image](https://github.com/xiaoyifang/goldendict-ng/assets/108539217/0efef7da-23d0-4503-a187-6e741e4692c8)

shenlebantongying commented 6 months ago

No. Doing that requires a Chinese word segmenter, which is not built in.

However, it is possible to use one inside GD (but this requires some scripting knowledge).

For example, there is this Japanese sentence segmenter for GD: https://github.com/Ajatt-Tools/gd-tools?tab=readme-ov-file#gd-mecab

Another example uses Python for German sentence segmentation: https://xiaoyifang.github.io/goldendict-ng/howto/how%20to%20add%20a%20program%20as%20dictionary/

atauzki commented 6 months ago

Chinese word segmenter: https://github.com/fxsjy/jieba
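
Combining the two suggestions above, a program dictionary could call jieba and print one link per word. The following is only a minimal sketch, not a tested GoldenDict integration: `render_links` and `main` are hypothetical names, jieba is a third-party package (`pip install jieba`), and the `bword:` link scheme is assumed to be resolved as an internal lookup link, as the gd-tools scripts appear to rely on.

```python
#!/usr/bin/env python3
"""Sketch: segment a Chinese sentence and print one lookup link per word."""
import html
import sys


def render_links(words):
    """Turn segmented words into HTML links, skipping empty or whitespace tokens."""
    links = []
    for word in words:
        if not word.strip():
            continue
        safe = html.escape(word, quote=True)
        # ASSUMPTION: "bword:" links are treated as internal dictionary lookups.
        links.append(f'<a href="bword:{safe}">{safe}</a>')
    return " ".join(links)


def main(argv):
    # Imported lazily so the rest of the script works without jieba installed.
    import jieba

    sentence = " ".join(argv[1:])
    # jieba.cut yields the segmented words (punctuation tokens included).
    print(render_links(jieba.cut(sentence)))


# Only run when a sentence was actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

If this works as assumed, the script would be registered as a program-type dictionary with HTML output and a command line along the lines of `python /path/to/segment.py %GDWORD%` (the path is a placeholder), following the how-to page linked above. The exact words returned depend on jieba's dictionary and mode.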