Closed ripperdoc closed 8 years ago
When I use Unicode input, all keywords get cut off at the Unicode character. I found the reason. In nlp.rb, line 102:
set :word_pattern, /(?<!@)(?<=\s)[\w']+/
should be
set :word_pattern, /(?<!@)(?<=\s)[\p{Word}']+/
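To illustrate the difference, here is a minimal sketch that runs both patterns over the same string. In Ruby, `\w` matches only ASCII letters, digits and underscore, while `\p{Word}` also matches Unicode word characters. The constant names and the `extract_keywords` helper are made up for this example and are not part of nlp.rb.

```ruby
# encoding: utf-8

# Pattern as currently written in nlp.rb: \w is ASCII-only in Ruby.
ASCII_WORD_PATTERN = /(?<!@)(?<=\s)[\w']+/

# Proposed pattern: \p{Word} also covers Unicode letters such as ü, ö, ß.
UNICODE_WORD_PATTERN = /(?<!@)(?<=\s)[\p{Word}']+/

# Hypothetical helper, only for this demonstration:
# returns every substring of +text+ that matches +pattern+.
def extract_keywords(text, pattern)
  text.scan(pattern)
end

# The leading space matters: the pattern requires whitespace before a word.
sample = " Müller's größe"

p extract_keywords(sample, ASCII_WORD_PATTERN)   # => ["M", "gr"]
p extract_keywords(sample, UNICODE_WORD_PATTERN) # => ["Müller's", "größe"]
```

With the ASCII-only class, each word stops at its first non-ASCII letter, and the rest of the word is skipped because it no longer follows whitespace.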
This still seems to be an issue. The German characters ä, ö, ü and ß still cause these problems.