Open: houshuang opened this issue 11 years ago
Works for me (my locale is UTF-8)
require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')
dict.valid? "рассчитывал"
#=> true
dict.encoding
#=> #<Encoding:UTF-8>
dict.stem "рассчитывал"
#=> ["рассчитывать"]
@houshuang what does __ENCODING__ return in irb? What is the output of the locale command?
Yeah, this is an encoding problem:
On Ubuntu 17.04 (hunspell 1.4.1-2build1):
dict = FFI::Hunspell.dict('ru_RU')
dict.encoding
# => #<Encoding:KOI8-R (autoload)>
dict.suggest('ощибка')
# => []
dict.suggest('ощибка'.encode(dict.encoding)).map { |s| s.encode(__ENCODING__) }
# => ["ощипка", "ошибка"]
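If you don't want to repeat the encode/decode dance at every call site, it can be wrapped in a small helper. This is only a sketch: suggest_transcoded is a made-up name, not part of ffi-hunspell, and it assumes the dictionary object responds to encoding and suggest as in the snippet above and returns strings tagged with the dictionary's encoding.

```ruby
# Hypothetical helper (not part of ffi-hunspell): transcode the word into the
# dictionary's own encoding before the lookup, then transcode each suggestion
# back to the caller's encoding (UTF-8 by default).
def suggest_transcoded(dict, word, target: Encoding::UTF_8)
  # String#encode raises Encoding::UndefinedConversionError if the word
  # contains characters the dictionary encoding cannot represent.
  native = word.encode(dict.encoding)
  dict.suggest(native).map { |s| s.encode(target) }
end

# Usage, assuming the ffi-hunspell gem and a ru_RU dictionary are installed:
#   require 'ffi/hunspell'
#   dict = FFI::Hunspell.dict('ru_RU')
#   suggest_transcoded(dict, 'ощибка')
```

The same wrapping should apply to stem and valid?, since they presumably hit the same encoding mismatch when the dictionary is KOI8-R and the input string is UTF-8.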
Can't get this to work; not sure if it's a UTF-8 issue or something else.
require 'ffi/hunspell'
c = FFI::Hunspell.dict('ru_RU')
p c.stem("рассчитывал")
#=> []
Command line using the hunspell binary:
$ echo рассчитывал | hunspell -d ru_RU -s
рассчитывал рассчитывать