DataKind-BLR / PrathamBooks-Sprint-2018

Code and documentation for the collaboration with PrathamBooks during Sprint 2018
MIT License

Create more text corpus #27

Open arnabbiswas1 opened 6 years ago

arnabbiswas1 commented 6 years ago

Since we are dealing with children's story books, we need to be careful when selecting a corpus. The page below lists sources (content in Markdown format) from a few other portals that publish story books under open licenses. These could potentially be used to build the corpus:

https://github.com/DataKind-BLR/PrathamBooks-Sprint-2018/wiki/Other-Resources-For-Corpus