go-ego / gse

Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others.
Apache License 2.0
2.57k stars 215 forks

Found a bug in file dict_util.go #173

Open ntcat opened 1 year ago

ntcat commented 1 year ago

In this function: func (seg *Segmenter) Reader(reader *bufio.Reader, files ...string) error

The check if fsErr != nil { if fsErr == io.EOF { ... } } must be placed after the call to seg.Dict.AddToken(token). Otherwise, the last line of the dictionary file is skipped, unless that last line is empty.

ntcat commented 1 year ago

Also, change this check:

        if freq == 0.0 {
            continue
        }

to:

        if freq == 0.0 && fsErr != io.EOF {
            continue
        }

Otherwise, the loop never terminates when the dictionary file is empty.