IntelLabs / nlp-architect

A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
https://intellabs.github.io/nlp-architect
Apache License 2.0
2.94k stars 447 forks

How to improve bist model accuracy? :Bistmodel #188

Closed by ravishpankar 3 years ago

ravishpankar commented 4 years ago

I'm using SpacyBISTParser with my own spaCy and BIST models, trained on the Universal Dependencies v2.0 dataset. My training follows the docs. I want to improve the accuracy of the BIST model. Currently, with the default parameters, I train the BIST model for 20 iterations, and the evaluation UAS is 87.8 percent. How can I improve it to 93.8, as mentioned in the docs? Is that possible with the English UD 2.0 dataset? Please help.

ravishpankar commented 4 years ago

Any answer?

peteriz commented 3 years ago

@danielkorat, can you please help?

ravishpankar commented 3 years ago

@peteriz, thank you for pinging @danielkorat. This is very important to my project. I would like fewer errors when parsing job descriptions, and I assume that improving the accuracy would result in fewer parsing errors. Please clarify this point too. I can always give it a try after improving the accuracy.

danielkorat commented 3 years ago

Hi @ravishpankar, sorry for the delayed response. The 93.8 UAS is the score reported in the original paper and the original code implementation. As mentioned there, that result was obtained on the Penn Treebank dataset (Stanford Dependencies), so training on a different dataset, such as UD 2.0, might yield different results.
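
For anyone comparing numbers across datasets, here is a minimal sketch of how UAS can be computed from a gold file and a predicted file. It is not the evaluation script used by nlp-architect or by the original BIST code. It assumes both files are tab-separated CoNLL-X or CoNLL-U with the head index in the 7th column, that the two files are token-aligned, and that every token (including punctuation) is counted; some published evaluations exclude punctuation, which changes the score. The function names `read_heads` and `uas` are made up for this illustration.

```python
def read_heads(conll_text):
    """Parse CoNLL text and return one list of head indices per sentence."""
    sentences, current = [], []
    for line in conll_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Skip CoNLL-U comment lines.
            continue
        cols = line.split("\t")
        if not cols[0].isdigit():
            # Skip CoNLL-U multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1).
            continue
        current.append(int(cols[6]))  # 7th column holds the head index
    if current:
        sentences.append(current)
    return sentences


def uas(gold_text, pred_text):
    """Unlabeled attachment score: share of tokens whose predicted head is correct."""
    gold = read_heads(gold_text)
    pred = read_heads(pred_text)
    if len(gold) != len(pred):
        raise ValueError("gold and predicted files have different sentence counts")
    correct = total = 0
    for gold_heads, pred_heads in zip(gold, pred):
        if len(gold_heads) != len(pred_heads):
            raise ValueError("sentence length mismatch between gold and prediction")
        correct += sum(g == p for g, p in zip(gold_heads, pred_heads))
        total += len(gold_heads)
    return correct / total if total else 0.0
```

Running the same function on a held-out UD dev set and on a Penn Treebank test set gives numbers that are at least computed the same way, although the annotation schemes still differ.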

ravishpankar commented 3 years ago

Hi @danielkorat, thanks a lot for answering. I understand that the Universal Dependencies 2.0 English dataset is different from the Penn Treebank dataset, which is based on Stanford dependency relations. Now the question is: how did you train the BIST model on the Penn Treebank dataset to create the pretrained model for SpacyBISTParser? Was it with the default parameters or something else? Any help will be greatly appreciated. Thank you.

danielkorat commented 3 years ago

Hi @ravishpankar, from what I recall, I used the default parameters. I did not do many training runs, and I don't have an evaluation score. Please refer to the two links in my previous comment.