JuliaText / TextAnalysis.jl

Julia package for text analysis

Fix language in stemmer type specification #115

Closed nickto closed 5 years ago

nickto commented 5 years ago

The language passed to the stemmer is inferred using name(language(d)), where d is an AbstractDocument. This returns the language's name in that language itself, e.g., "русский" for Russian. The Snowball stemmer, however, requires an English name or an ISO code:

The algorithm may be selected using the english name of the language, or using the 2 or 3 letter ISO 639 language codes.

This fix infers the English language name instead.

(Tested only on Russian).
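
The mismatch described above can be illustrated with a minimal, self-contained sketch. It does not use the real TextAnalysis.jl, Languages.jl, or Snowball APIs: the Lang struct, the KNOWN_ALGORITHMS set, and the stemmer_available function are hypothetical stand-ins, and lowercasing the identifier before lookup is an assumption of this sketch.

```julia
# Hypothetical stand-in for a language type that knows both of its names.
struct Lang
    native_name::String   # name in the language itself, e.g. "русский"
    english_name::String  # English name, e.g. "Russian"
    iso_code::String      # ISO 639 code, e.g. "ru"
end

# Hypothetical subset of identifiers that a Snowball-style lookup accepts:
# English names plus 2- and 3-letter ISO 639 codes.
const KNOWN_ALGORITHMS = Set(["english", "en", "eng", "russian", "ru", "rus"])

# Mimics the lookup. Lowercasing first is an assumption of this sketch.
stemmer_available(id::AbstractString) = lowercase(id) in KNOWN_ALGORITHMS

russian = Lang("русский", "Russian", "ru")

# The native name is not a recognised identifier, so the lookup fails.
println(stemmer_available(russian.native_name))
# The English name and the ISO code are both recognised.
println(stemmer_available(russian.english_name))
println(stemmer_available(russian.iso_code))
```

In this sketch only the English name and the ISO code resolve to a stemmer, which is why selecting the algorithm by the native name fails for any language whose own name differs from its English one.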