dhowe / rita

Website, documentation and examples for RiTa
https://rednoise.org/rita

Incorrect stems for adjectives derived by adding "-y" #140

Closed shadoof closed 3 years ago

shadoof commented 3 years ago
stem("watery") -> "wateri"; should be "water"
stem("funny") -> "funny"; should be "fun"
stem("slinky") -> "slinki"; should be "slink" (although this one is derived from a verb, I guess)
stem("skinny") -> "skinni"; should be "skin"

I would say that the stem should be the stem of the noun or verb from which the adjective is derived. The algorithm should, of course, check for a final "-ly", which should be handled differently (since it generates adverbs).
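A rough sketch of the behaviour being asked for, to make the request concrete. This is not RiTa code: `deriveBase` and `BASE_WORDS` are hypothetical names, and the tiny word list stands in for a real dictionary lookup, which any serious version would need in order to decide when stripping "-y" (and undoing a doubled consonant) gives a real word.

```javascript
// Hypothetical mini-lexicon of known base nouns/verbs (assumption:
// a real implementation would consult a full dictionary instead).
const BASE_WORDS = new Set(["water", "fun", "slink", "skin"]);

// Hypothetical helper: map a "-y" adjective back to its base word.
function deriveBase(word) {
  // Words in "-ly" are left alone; they need separate handling.
  if (!word.endsWith("y") || word.endsWith("ly")) return word;

  // Drop the final "y": "watery" -> "water"
  const trimmed = word.slice(0, -1);
  if (BASE_WORDS.has(trimmed)) return trimmed;

  // Undo consonant doubling: "funny" -> "funn" -> "fun"
  const n = trimmed.length;
  if (n >= 2 && trimmed[n - 1] === trimmed[n - 2]) {
    const undoubled = trimmed.slice(0, -1);
    if (BASE_WORDS.has(undoubled)) return undoubled;
  }

  // No known base found, so leave the word unchanged.
  return word;
}

console.log(["watery", "funny", "slinky", "skinny"].map(deriveBase));
```

Note that this is closer to derivational analysis than to suffix-stripping stemming: without the lexicon check, the same rules would mangle words such as "happy".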

dhowe commented 3 years ago

So what are the expected stems here?

https://en.wikipedia.org/wiki/Word_stem

shadoof commented 3 years ago

Editing the original comment above in reply.

dhowe commented 3 years ago

So we are using a custom version of the Snowball (or Porter2) stemmer, for which these are actually the correct stems. You can check words quickly here: https://snowballstem.org/demo.html
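For anyone wondering where the trailing "i" comes from: the published English Snowball (Porter2) algorithm has a step (1c) that replaces a final "y" with "i" when the preceding letter is a non-vowel that is not the first letter of the word. Below is a minimal sketch of that single rule only, assuming lowercase input; `step1c` is an illustrative name, not RiTa's internal function, and RiTa's custom version may differ (for instance, the "funny" result reported above is not explained by this rule alone).

```javascript
// Sketch of Porter2 step 1c in isolation (not the full stemmer):
// final "y" -> "i" if preceded by a non-vowel that is not the
// first letter of the word. Porter2 treats "y" as a vowel here.
function step1c(word) {
  const vowels = "aeiouy";
  const n = word.length;
  // n > 2 guarantees the letter before "y" is not the first letter.
  if (n > 2 && word.endsWith("y") && !vowels.includes(word[n - 2])) {
    return word.slice(0, n - 1) + "i";
  }
  return word;
}

console.log(step1c("watery")); // "wateri"
console.log(step1c("slinky")); // "slinki"
console.log(step1c("skinny")); // "skinni"
console.log(step1c("say"));    // "say" (preceded by a vowel)
console.log(step1c("by"));     // "by"  (preceding letter is the first letter)
```

None of the later Porter2 steps removes that "i" for these words, which is why the stems end in "-i" rather than matching the base noun or verb.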