go-ego / gse

Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
Apache License 2.0

English cut bug #170

Closed Zrzzzz closed 1 year ago

Zrzzzz commented 1 year ago
seg.LoadDict("zh")
seg.LoadDict("en")
seg.LoadDict("jp")
seg.LoadStop("zh")
logrus.Debugln(seg.CutSearch("Nowadays, there are more and more misunderstanding between parents and children which is so- called generation gap. It is estimated that ( 75 percentages of parents often complain their children’s unreasonable behavior while children usually think their parents too old fashioned )."))
[nowadays ,   there   are   more   and   more   misunderstanding   between   parents   and   children   which   is   so -   called   generation   gap .   it   is   estimated   that   (   75   percentages   of   parents   often   complain   their   children ’ s   unreasonable   behavior   while   children   usually   think   their   parents   too   old   fashioned   ) .]

Description

I looked at data/en/dict.txt and found it empty, so it seems that gse doesn't actually support English text segmentation.

vcaesar commented 1 year ago

Have you read the README.md?

Zrzzzz commented 1 year ago

Is no English dictionary provided? I only see a custom English dictionary, yet gse is said to support English segmentation.

nitezs commented 1 year ago

Is no English dictionary provided? I only see a custom English dictionary, yet gse is said to support English segmentation.

I'm also looking for a way to use gse for English word segmentation. Have you found an English dictionary yet?