Open ivory2406 opened 1 day ago
I would like this to be configurable through a parameter.
@vcaesar Could you help me with the toLower option param? Thanks very much.
@CocaineCong Hello, could you help me with the toLower option param? I want to use gse to tokenize sentences and then use mmh3 to encode the tokens.
Whether a character is lowercase or uppercase is very important to me, because the mmh3 value of a word differs depending on its case.
Hello, I want to keep uppercase letters, as in this example:
The result is: ["hello"," ","world",","," ","helloworld","."," ","winter"," ","is"," ","coming","!"," ","你好","世界","."]
I hope the result is ["Hello"," ","world",","," ","Helloworld","."," ","Winter"," ","is"," ","coming","!"," ","你好","世界","."]
And I have seen the option params: https://github.com/go-ego/gse/blob/master/segmenter.go
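Until the lowercasing can be switched off by a parameter, one possible workaround is to restore the original case after segmentation. The sketch below does not call gse at all. It assumes the tokens, concatenated in order, equal the lowercased input (the result above suggests this, since spaces and punctuation are kept as tokens). restoreCase is a hypothetical helper name, not part of gse, and the sample input is inferred by joining the tokens of the desired result above.

```go
package main

import (
	"fmt"
	"strings"
)

// restoreCase is a hypothetical helper (not part of gse). It walks the
// original text rune by rune and cuts it at the same offsets as the
// lowercased tokens, so each returned token keeps its original case.
// Assumption: the tokens, joined in order, equal strings.ToLower(original).
func restoreCase(original string, tokens []string) ([]string, error) {
	runes := []rune(original)
	out := make([]string, 0, len(tokens))
	pos := 0
	for _, tok := range tokens {
		n := len([]rune(tok))
		if pos+n > len(runes) {
			return nil, fmt.Errorf("tokens overrun original text at rune %d", pos)
		}
		segment := string(runes[pos : pos+n])
		// Guard against a tokenizer that dropped or rewrote characters.
		if strings.ToLower(segment) != tok {
			return nil, fmt.Errorf("token %q does not match original at rune %d", tok, pos)
		}
		out = append(out, segment)
		pos += n
	}
	return out, nil
}

func main() {
	// Sample input inferred from the desired result in this issue.
	text := "Hello world, Helloworld. Winter is coming! 你好世界."
	// Lowercased tokens as reported in this issue.
	lowered := []string{"hello", " ", "world", ",", " ", "helloworld", ".", " ",
		"winter", " ", "is", " ", "coming", "!", " ", "你好", "世界", "."}

	restored, err := restoreCase(text, lowered)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", restored)
}
```

The restored tokens can then be passed to mmh3 with their original case. If the segmenter ever drops or rewrites characters, the helper returns an error instead of silently misaligning the tokens.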