Hi,
I'm using polyglot 16.7.4 and Python 3.7.
For NER, I noticed that my Python program and the online demo give different results. The demo is at this link: https://sites.google.com/site/rmyeid/projects/polylgot-ner
When I select the 'tokenize' option in the online demo, NER gives great results, but my Python program gives different results, which appear to match the demo's output when 'tokenize' is turned off.
I've also noticed that the polyglot command line lets you tokenize before NER, like this: polyglot --lang en tokenize --input testdata/cricket.txt | polyglot --lang en ner | tail -n 20, but there is no such option when running NER from Python. Does that mean tokenization is done automatically before NER?
And if so, why do the results from my Python program differ from those of the online polyglot NER demo?
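To make the comparison concrete, here is a minimal sketch of the two ways NER could be run from Python: on the raw text, and on text that has been tokenized first. It assumes the documented `polyglot.text.Text` API and its `entities` property. The `pretokenize` helper is hypothetical: it uses a simple regex as a stand-in for the command line's tokenize step, and it may not match what the demo's 'tokenize' option actually does.

```python
import re


def pretokenize(text):
    """Hypothetical stand-in for a tokenize step (not polyglot's tokenizer).

    Splits punctuation away from words and rejoins all tokens with
    single spaces, so the NER step sees whitespace-separated tokens.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return " ".join(tokens)


def extract_entities(text, lang="en"):
    """Run polyglot NER and return (tag, entity string) pairs.

    Assumes polyglot and the embeddings/NER models for `lang` are installed.
    """
    # Imported lazily so pretokenize() can be used without polyglot installed.
    from polyglot.text import Text

    parsed = Text(text, hint_language_code=lang)
    # Each entity is a chunk of words carrying a tag such as I-PER or I-LOC.
    return [(entity.tag, " ".join(entity)) for entity in parsed.entities]
```

Calling `extract_entities(raw)` and `extract_entities(pretokenize(raw))` on the same input would show whether tokenizing first accounts for the difference.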
Thank you.