Open avinsit123 opened 4 years ago
@avinsit123 Thanks for reaching out. It would be great to integrate your work into the iNLTK library.
In order to add support for Hindi NER, it would be great if you can:
Let me know what you think.
@goru001 We will mail you the items mentioned above once we have finished refining the model. So far we have trained our model with several embeddings (e.g. fastText, RoBERTa) using the flair NLP library. It would also be great to add support in iNLTK so that users can custom-train their own NER models.
@avinsit123 Sure, will wait for your mail. Thanks!
@avinsit123 Do you have any resources where I can get a similar NER dataset for Tamil?
@avinsit123 How about using word-level iNLTK embeddings and then XGBoost to classify the tokens?
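To make the suggestion above concrete, here is a minimal sketch of how tagged sentences could be turned into one feature row per token before being handed to a classifier. It is only an illustration: `toy_embed` is a hypothetical stand-in for a real word-level embedding lookup (it is not the iNLTK API), and the sentence, the tags `PER`/`LOC`/`O`, the window size and the vector dimension are all made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
TaggedSentence = Sequence[Tuple[str, str]]  # (token, tag) pairs


def toy_embed(token: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for a word-level embedding lookup (NOT the iNLTK API)."""
    # Deterministic toy values derived from the characters of the token.
    total = sum(ord(ch) for ch in token)
    return [((total * (i + 1)) % 97) / 97.0 for i in range(dim)]


def token_features(
    tokens: Sequence[str],
    embed: Callable[[str], Vector],
    window: int = 1,
    dim: int = 4,
) -> List[Vector]:
    """Concatenate each token's vector with the vectors of its neighbours."""
    pad = [0.0] * dim  # used where the window runs past the sentence edge
    vectors = [embed(tok) for tok in tokens]
    rows: List[Vector] = []
    for i in range(len(tokens)):
        row: Vector = []
        for j in range(i - window, i + window + 1):
            row.extend(vectors[j] if 0 <= j < len(tokens) else pad)
        rows.append(row)
    return rows


def build_dataset(
    tagged_sentences: Sequence[TaggedSentence],
    embed: Callable[[str], Vector],
    window: int = 1,
    dim: int = 4,
) -> Tuple[List[Vector], List[str]]:
    """Flatten tagged sentences into per-token feature rows X and labels y."""
    X: List[Vector] = []
    y: List[str] = []
    for sentence in tagged_sentences:
        tokens = [tok for tok, _ in sentence]
        X.extend(token_features(tokens, embed, window, dim))
        y.extend(tag for _, tag in sentence)
    return X, y


# Made-up example sentence with illustrative tags.
sentences = [[("राम", "PER"), ("दिल्ली", "LOC"), ("गया", "O")]]
X, y = build_dataset(sentences, toy_embed)
```

The rows in `X` and the labels in `y` (after encoding the tags as integers) could then be passed to a gradient-boosted classifier such as XGBoost's `XGBClassifier`. One trade-off to keep in mind: a classifier like this labels each token independently, so unlike a CRF layer it does not model which tag sequences are plausible.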
We are currently working on a research project on NER in Hindi and would like to extend our code and work to add Hindi NER support to iNLTK. Our current model (Embeddings -> LSTM -> CRF) is trained on this dataset, http://ltrc.iiit.ac.in/ner-ssea-08/index.cgi?topic=2, with 14 tags and has an accuracy of around 70%, which we are currently trying to improve. Do you have any contribution guidelines for the project, or any specifics you would like in the NER model? Either way, we are really interested in contributing to the project.
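For readers unfamiliar with the CRF step in an Embeddings -> LSTM -> CRF tagger, the sketch below shows the usual way such a layer picks its output: Viterbi decoding over per-token scores plus tag-to-tag transition scores. This is not the code from the project described above; the tag names, the emission scores (which an LSTM would normally produce) and the transition scores are invented for illustration.

```python
from typing import Dict, List


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]     : score of `tag` at position t
    transitions[a][b]     : score of moving from tag a to tag b
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []  # back[t-1][tag] = best previous tag
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


tags = ["O", "B-PER", "I-PER"]  # illustrative tags, not the dataset's 14
# All transitions score 0 except O -> I-PER, which is heavily penalised
# because an "inside" tag should not follow an "outside" tag.
transitions = {a: {b: 0.0 for b in tags} for a in tags}
transitions["O"]["I-PER"] = -10.0
# Invented per-token scores for a three-token sentence.
emissions = [
    {"O": 1.0, "B-PER": 0.9, "I-PER": 0.0},
    {"O": 0.5, "B-PER": 0.3, "I-PER": 2.0},
    {"O": 1.5, "B-PER": 0.1, "I-PER": 0.2},
]
decoded = viterbi(emissions, transitions, tags)
greedy = [max(tags, key=lambda tag: e[tag]) for e in emissions]
```

With these toy scores, picking the best tag per token independently (`greedy`) yields `O, I-PER, O`, while `decoded` is `B-PER, I-PER, O`: the transition penalty steers the first token to `B-PER` so that the sequence stays consistent.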