Hi, not an issue, but I would like to know why I get an overall accuracy of 76.4% on tweet entities with get_ner_conll and 82.15% with get_ner_ontonotes. Would you expect one to be more accurate than the other? What is the rationale behind this difference?
The two models are different since they are trained on different datasets (and have different label sets). I don't know much about your tweet dataset, but a good way to investigate is to try a few examples in the demo: http://nlp.cogcomp.org/
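For what it's worth, here is a minimal sketch of trying both views side by side with ccg_nlpy (assuming the default public remote pipeline is reachable; the example sentence is made up):

```python
# Minimal sketch: run both NER views on the same sentence and compare.
from ccg_nlpy import remote_pipeline

pipeline = remote_pipeline.RemotePipeline()
doc = pipeline.doc("Ted Cruz campaigned in Texas on Monday.")

print(doc.get_ner_conll)      # CoNLL model: 4 coarse labels (PER, LOC, ORG, MISC)
print(doc.get_ner_ontonotes)  # OntoNotes model: a larger, finer-grained label set
```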
I'd be careful with this tool, since a few others have reported issues with the NER view (#12, #81, etc.). What is the range of performance for other toolboxes on this dataset?
cogcomp-nlp: yeah, you can cite that. OntoNotes citation: that looks right to me.
@mssammon can you confirm that we use Ontonotes 5?
FYI @danr-ccg
We have the following results:
82.15% for your tool using the OntoNotes model
82.7% for TwitterNLP
83.2% for Stanford NER 3 class model
86.8% for spaCy web_lg English model
96.5% for Google Cloud Natural Language NER
This is for the WiNLP workshop paper at NAACL HLT 2018: "Comparing the Performance of Crowdworkers and NLP Tools on Named-Entity Recognition and Entity-Level Sentiment Analysis of Political Tweets" by Mona Jalal, Kate K. Mays, Lei Guo, and Margrit Betke. http://naacl2018.org/downloads/naacl_hlt_2018_handbook.pdf
One thing I am observing is that @-mentions are often not recognized as named entities, even in obvious cases like @tedcruz.
Also, when I used CoNLL, I got 50% accuracy for Cruz entities across 1000 tweets, and now using OntoNotes I am getting 79.6%. The rest of the entities don't have this problem. One suggestion, if you want the tool to be applicable to tweets, is to make it work for recognizable Twitter handles like @tedcruz, or handles that are even easier thanks to correct capitalization, like @HillaryClinton.
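For example, a rough preprocessing sketch (hypothetical, not part of the toolkit): split a capitalized handle into words before running NER. Note that an all-lowercase handle like @tedcruz carries no capitalization cues, so its word boundaries cannot be recovered this way.

```python
import re

# Hypothetical preprocessing: turn a Twitter handle into a plain name so a
# newswire-trained NER model has a chance of tagging it.
def expand_handle(token):
    if not token.startswith("@"):
        return token
    name = token[1:]
    # Split on internal capitalization (CamelCase handles).
    parts = re.findall(r"[A-Z][a-z]+|[a-z]+|[A-Z]+(?![a-z])", name)
    return " ".join(p.capitalize() for p in parts) if parts else name

print(expand_handle("@HillaryClinton"))  # -> Hillary Clinton
print(expand_handle("@tedcruz"))         # -> Tedcruz (no case cues to split on)
```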
Also, in one of my screenshots, as you can see, SANDERS is not detected as an entity (most likely because it is in all caps). @danyaljj, the only reason I said OntoNotes 5.0 was that you mentioned it in the 'swiss army knife' LREC 2018 paper. I also don't know whether you use the same version in your current platform.
Thanks for sharing your findings and showing us your cool error analysis. We have not trained our system on any Twitter datasets, so many of these failures are expected. That said, we will definitely try to improve the system based on these mistakes.
One other thing: the version in the current NLPy is "3.1.15". If you have time, try "4.0.3" too, which is the version used in the online demo. To do so, change the version in .ccg_nlpy/config.cfg and have the system download the models again.
I don't expect this to make a huge difference, but at least it would reflect the performance of the latest version.
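In case it is useful, here is a sketch of bumping the version programmatically. The INI layout and the key name "version" are assumptions, not the documented format, so check your actual config.cfg first:

```python
# Sketch: update the model version in ccg_nlpy's config file.
# ASSUMPTION: config.cfg is INI-style and stores the version under a key
# named "version"; adjust to match your actual file.
import configparser
import os

path = os.path.expanduser("~/.ccg_nlpy/config.cfg")
config = configparser.ConfigParser()
config.read(path)

for section in config.sections():
    if "version" in config[section]:
        config[section]["version"] = "4.0.3"

with open(path, "w") as f:
    config.write(f)
# On the next run, the system should notice the new version and
# download the corresponding models again.
```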
Thank you so much for the explanation. I tried the newer version after stopping my server and starting it back up, and I still get the same accuracies (CCR, the correct classification rate, i.e., the diagonal of the confusion matrix).
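(For clarity, this is the CCR computation I mean; the labels and counts below are toy values:)

```python
# Per-class correct classification rate (CCR): the diagonal of the
# confusion matrix divided by each row's total (rows = gold labels).
from sklearn.metrics import confusion_matrix

y_true = ["PER", "PER", "ORG", "LOC", "PER"]  # toy gold labels
y_pred = ["PER", "ORG", "ORG", "LOC", "PER"]  # toy predictions

labels = ["PER", "ORG", "LOC"]
cm = confusion_matrix(y_true, y_pred, labels=labels)
ccr = cm.diagonal() / cm.sum(axis=1)

for label, rate in zip(labels, ccr):
    print(f"{label}: {rate:.1%}")  # PER: 66.7%, ORG: 100.0%, LOC: 100.0%
```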
Also, if I am using the OntoNotes model, is it enough to cite your 'swiss army knife of NLP' paper, or do I also need to cite whatever originally introduced OntoNotes? http://christos-c.com/papers/khashabi_18_cogcompnlp.pdf
Also, is this sentence correct, or how would you correct it? "For CogComp-NLP NER, we used the OntoNotes 5.0 NER model."
Is this the correct BibTeX for OntoNotes 5.0?

@article{weischedel2013ontonotes,
  title   = {OntoNotes Release 5.0 LDC2013T19},
  author  = {Weischedel, Ralph and Palmer, Martha and Marcus, Mitchell and Hovy, Eduard and Pradhan, Sameer and Ramshaw, Lance and Xue, Nianwen and Taylor, Ann and Kaufman, Jeff and Franchini, Michelle and others},
  journal = {Linguistic Data Consortium, Philadelphia, PA},
  year    = {2013}
}
https://catalog.ldc.upenn.edu/docs/LDC2013T19/OntoNotes-Release-5.0.pdf
Thanks