michmech / lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.
Open Data Commons Open Database License v1.0

Add lemmas for more languages using Wiktionary #8

Open EmilStenstrom opened 5 years ago

EmilStenstrom commented 5 years ago

Would it be possible to add more languages by parsing the Wiktionary dumps?

EmilStenstrom commented 5 years ago

Here's example code showing how to parse Wiktionary: https://github.com/kaizinho/lemmatizer
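
To make the idea concrete, here is a minimal sketch of how lemma-token pairs might be pulled out of a Wiktionary XML dump. It is not taken from the linked repository. It assumes that inflected forms are marked in the page wikitext with a form-of template such as `{{plural of|en|cat}}`, and that the output should be one tab-separated `lemma<TAB>token` pair per line. The `SAMPLE_DUMP` string, the `extract_pairs` helper, and the single-template regex are all hypothetical simplifications; a real dump is far larger and uses many more templates.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a Wiktionary XML dump.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>cats</title>
    <revision><text>==English==
# {{plural of|en|cat}}</text></revision>
  </page>
  <page>
    <title>cat</title>
    <revision><text>==English==
# A domesticated feline.</text></revision>
  </page>
  <page>
    <title>mice</title>
    <revision><text>==English==
# {{plural of|en|mouse}}</text></revision>
  </page>
</mediawiki>"""

# Assumption: only the "plural of" template is handled here; other
# form-of templates would need their own patterns.
FORM_OF = re.compile(r"\{\{plural of\|(?P<lang>[^|}]+)\|(?P<lemma>[^|}]+)")


def extract_pairs(stream, lang="en"):
    """Yield (lemma, token) pairs from a dump, streaming page by page."""
    title = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag == "title":
            title = elem.text
        elif tag == "text" and elem.text:
            for match in FORM_OF.finditer(elem.text):
                if match.group("lang") == lang:
                    yield match.group("lemma"), title
        elif tag == "page":
            elem.clear()  # release the finished page to keep memory flat


pairs = list(extract_pairs(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
for lemma, token in pairs:
    print(f"{lemma}\t{token}")
```

Streaming with `iterparse` matters here because full dumps are too large to load into memory at once. Pages without a matching template, such as the lemma page itself, simply produce no pairs.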