tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

Simple English #824

Closed kristian-clausal closed 3 days ago

kristian-clausal commented 2 weeks ago

This is meant to be an "introductory" extractor, because Simple English Wiktionary is small and reasonably consistent. It has a simpler structure than usual, because there is no need to handle any language other than English.

kristian-clausal commented 2 weeks ago

To be done: tests, tags, all sorts of things.

kristian-clausal commented 2 weeks ago

I've copy-pasted some code from other extractors by @xxyzz, mostly just the Pydantic models.