goru001 / inltk

Natural Language Toolkit for Indic Languages aims to provide out-of-the-box support for various NLP tasks that an application developer might need
https://inltk.readthedocs.io
MIT License
824 stars, 163 forks

Contribution for Gujarati-English #87

Open Darshan2104 opened 2 years ago

Darshan2104 commented 2 years ago

Hey there, I want to contribute to iNLTK. Is there a way to add support for Gujlish (Gujarati-English)? Can you guide me on how to go about it? As far as I know, we need to create a tokenizer and train a model on a Gujlish dataset. The main question is: how do I get the data? Also, let me know if I'm missing something.

Thanks :)