Closed AshishSardana closed 4 years ago
Hi Ashish, Our paper based on this work is under review. We will release the corpus on acceptance, which we expect to be around September. In the meantime, you can try the Oscar Corpus for the languages you mentioned (https://traces1.inria.fr/oscar). Regards, Anoop.
Thank you Anoop, I wish you the best! This dataset isn't mentioned in the indicnlp_catalog GitHub repo; you might want to link it there as well.
Thanks for pointing out this oversight. I will add it to the repo.
What would be an approximate timeline for the release of the raw text corpora? Also, can you point me to other resources where I can get free text corpora for Gujarati, Tamil, Telugu and Marathi?