Nice work on this project, it's amazing!
However, what about lemmatization? There seems to be no information about parts of speech, unless I'm missing it. Are words counted without considering this? If so, then words like English "record" [verb] would be counted together with "record" [noun], and words with many different inflected forms would have each individual form counted separately, giving no overview of how popular the lemma itself is. Is any functionality built in or planned to take lemmata into consideration?
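
To make the question concrete, here is a rough sketch of the difference I mean. It does not use this project's code or API; the token list is hand-tagged and purely hypothetical, and a real pipeline would need an actual tagger and lemmatizer to produce the lemma and part-of-speech columns.

```python
from collections import Counter

# Hypothetical hand-tagged tokens: (surface form, lemma, part of speech).
# These tags are made up for illustration; no real tagger is assumed here.
TOKENS = [
    ("record", "record", "VERB"),
    ("record", "record", "NOUN"),
    ("records", "record", "NOUN"),
    ("recorded", "record", "VERB"),
    ("recording", "record", "VERB"),
    ("records", "record", "VERB"),
]


def count_surface_forms(tokens):
    """Count each spelled form on its own, ignoring lemma and part of speech."""
    return Counter(form for form, _lemma, _pos in tokens)


def count_lemmas(tokens):
    """Count by (lemma, part of speech), merging all inflected forms."""
    return Counter((lemma, pos) for _form, lemma, pos in tokens)


surface_counts = count_surface_forms(TOKENS)
lemma_counts = count_lemmas(TOKENS)

print(surface_counts)
print(lemma_counts)
```

Counting by surface form lumps the verb and noun "record" into one entry while splitting "recorded" and "recording" into separate ones; counting by (lemma, part of speech) does the opposite, which is the overview I'm asking about.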