AI4Bharat / indicnlp_catalog

A collaborative catalog of NLP resources for Indic languages
https://ai4bharat.github.io/indicnlp_catalog

Corpora for extremely low-resource languages #174

Open anoopkunchukuttan opened 2 years ago

anoopkunchukuttan commented 2 years ago

See Table 1 in the following paper:

Prasad, Manasa, Theresa Breiner, and Daan van Esch. "Mining Training Data for Language Modeling Across the World's Languages." SLTU. 2018.
