aaaton closed this issue 5 years ago
We can make it output the shortest or the first in alphabetical order.
Both suggestions sound reasonable, just to get rid of the unpredictability.
Another solution would be to respond with the most likely lemmatization, but that requires at least a minimal TF-IDF model, which might be a little out of scope for this package. I'm not sure how that would adapt to different language domains either.
Solved in v2.0. Golem now always returns the first result in alphabetical order when there are multiple to choose from.
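For anyone curious what the two tie-breaking rules discussed above look like, here is a minimal sketch. It is not golem's actual code: `firstAlphabetical` and `shortestFirst` are hypothetical helpers, and the candidate list is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// firstAlphabetical is a hypothetical helper (not part of golem).
// It returns the lexicographically smallest candidate, or "" if there are none.
func firstAlphabetical(candidates []string) string {
	if len(candidates) == 0 {
		return ""
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]string(nil), candidates...)
	sort.Strings(sorted)
	return sorted[0]
}

// shortestFirst is a hypothetical helper (not part of golem).
// It returns the shortest candidate by byte length and breaks
// length ties alphabetically, so the result is deterministic.
func shortestFirst(candidates []string) string {
	if len(candidates) == 0 {
		return ""
	}
	best := candidates[0]
	for _, c := range candidates[1:] {
		if len(c) < len(best) || (len(c) == len(best) && c < best) {
			best = c
		}
	}
	return best
}

func main() {
	// Made-up candidates; the order is deliberately not sorted.
	candidates := []string{"leaves", "leave", "leaf"}
	fmt.Println(firstAlphabetical(candidates))
	fmt.Println(shortestFirst(candidates))
}
```

Either rule gives the same answer regardless of the order in which the candidates arrive, which is the point: the result is predictable, not necessarily linguistically correct.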
If you are reading this and want the "correct" lemmatization, I suggest getting all possible results from golem.Lemmas(word string) []string
and implementing a better guess yourself based on the context or corpus you are working with.
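One possible way to implement such a guess is to prefer the candidate that occurs most often in your own corpus. The sketch below assumes you already have the candidates (for example from the Lemmas call mentioned above, hard-coded here) and a word-frequency table built from your data. `pickByFrequency` and the counts are hypothetical and only illustrate the idea.

```go
package main

import "fmt"

// pickByFrequency is a hypothetical helper (not part of golem).
// It returns the candidate with the highest count in freq.
// Candidates missing from freq count as zero, and ties are
// broken alphabetically so the result stays deterministic.
func pickByFrequency(candidates []string, freq map[string]int) string {
	best := ""
	for i, c := range candidates {
		if i == 0 || freq[c] > freq[best] || (freq[c] == freq[best] && c < best) {
			best = c
		}
	}
	return best
}

func main() {
	// Assumed input: in real use this slice would come from the lemmatizer.
	candidates := []string{"leaf", "leave"}

	// Assumed corpus statistics; the counts are invented for the example.
	freq := map[string]int{"leave": 120, "leaf": 7}

	fmt.Println(pickByFrequency(candidates, freq))
}
```

With an empty frequency table this falls back to the first alphabetical candidate, which matches the default behaviour described in this thread.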
If we have a word with multiple options for how it should be lemmatized, the behaviour is undefined.