wipfli / word-corpus

Scripts that extract a word corpus from OpenStreetMap, Wikipedia, and Wikidata, targeting South-East Asian and Indic languages.
MIT License