Lemmatization in German

mcallaghan / text-as-data

Open OptimisticSnail opened 1 year ago

OptimisticSnail commented 1 year ago

Hello!

I am trying to lemmatize my German-language tokens. Any hints on how I could do so, e.g. packages to use (ideally in combination with quanteda)?

I'd greatly appreciate any help!

Sonja

mcallaghan commented 1 year ago

The example they give using tokens_wordstem from quanteda looks promising, though note that it performs stemming (Snowball) rather than true lemmatization:

library(quanteda)  # texts is assumed to be a character vector or corpus
tokens_wordstem(tokens(texts), language = "de")

I have not tested this, so I would appreciate any feedback on whether it works.
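
If stems are not enough, one possible approach is to replace tokens via a lemma lookup table with quanteda's tokens_replace. The following is only a minimal sketch: the sentence and the three-row lemma_table are made up for illustration, and in practice the table would have to come from a full German lemma list that you supply yourself.

```r
library(quanteda)

# Toy input (hypothetical); replace with your own character vector or corpus.
txt <- c(d1 = "Die Kinder lasen viele Zeitungen")
toks <- tokens(txt)

# Stemming: the Snowball German stemmer strips suffixes, so the result
# is not guaranteed to be a dictionary form.
stems <- tokens_wordstem(toks, language = "de")

# Lemmatization by lookup: a hand-made, hypothetical lemma table.
lemma_table <- data.frame(
  form  = c("Kinder", "lasen", "Zeitungen"),
  lemma = c("Kind", "lesen", "Zeitung"),
  stringsAsFactors = FALSE
)

# Replace each inflected form with its lemma; tokens that are not in
# the table (here "Die" and "viele") are left unchanged.
lemmas <- tokens_replace(
  toks,
  pattern = lemma_table$form,
  replacement = lemma_table$lemma,
  valuetype = "fixed",
  case_insensitive = FALSE
)

print(as.list(stems))
print(as.list(lemmas))
```

Printing both results side by side should make it easy to judge whether stemming is good enough for your use case, or whether a lookup-based replacement is worth the effort of finding a suitable lemma list.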