BrightXiaoHan closed this issue 1 year ago
@BrightXiaoHan Thanks for creating this issue. If this is urgent, could you please update the link at https://github.com/thammegowda/mtdata/blob/c57dab559e05e80ccbcb26fc44bb7fc94d676ef2/mtdata/index/ai4bharat.py#L17 to v0.3 (or the newest version) from https://ai4bharat.iitm.ac.in/samanantar and test whether it works? Thanks!
Thanks for your reply. I will try to test it.
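Before editing the index, it may help to confirm that the candidate download link actually resolves. The snippet below is only a rough sketch using the Python standard library; it is not part of mtdata, the helper name `is_reachable` is made up for illustration, and the real v0.3 download URL is not given in this thread, so it has to be copied from the Samanantar page.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if a HEAD request to `url` gets a 2xx/3xx response.

    `opener` is injectable so the check can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, TimeoutError):
        # HTTPError (4xx/5xx) is a subclass of URLError, so it lands here too.
        return False


# Example call (hypothetical placeholder, replace with the real link):
# is_reachable("<v0.3 download URL from the Samanantar page>")
```

Some servers reject HEAD requests, so a `False` result is a hint to check the link by hand rather than proof that it is dead.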
Samanantar is the largest publicly available collection of parallel corpora for Indic languages: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu. The corpus has 49.6M sentence pairs between English and Indian languages.
https://ai4bharat.iitm.ac.in/samanantar