dat / pyner

Python interface to the Stanford Named Entity Recognizer

Empty set of entities #2

Closed: muwaqar closed this issue 11 years ago

muwaqar commented 12 years ago

I have installed Pyner successfully. However, when I run the example, an empty set of entities is returned, as shown below:

$ python
Python 2.7.3 (default, Sep 26 2012, 21:53:58)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ner
>>> tagger = ner.HttpNER(host='localhost', port=1234)
>>> tagger.get_entities("University of California is located in California, United States")
{}

The command I use to run Stanford NER is:

java -mx1000m -cp stanford-ner.jar edu.stanford.nlp.ie.NERServer -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz -port 1234

dat commented 11 years ago

This is because you're running it through the Java server option, which uses a plain socket. To use HttpNER, build it as a .war file (see the Makefile) and deploy it as a servlet under Tomcat.

dat commented 11 years ago

If the socket server is the mode you're interested in running, take a look at the SocketNER class. That is what you're looking for.

Liontooth commented 10 years ago

I found this worked on a fresh stanford-ner 3.4 and pyner:

$ cp stanford-ner.jar stanford-ner-with-classifier.jar
$ jar -uf stanford-ner-with-classifier.jar classifiers/english.all.3class.distsim.crf.ser.gz
$ java -mx500m -cp stanford-ner-with-classifier.jar edu.stanford.nlp.ie.NERServer -port 2020 \
    -loadClassifier classifiers/english.all.3class.distsim.crf.ser.gz &

Then define the tagger:

tagger = ner.SocketNER(host='localhost', port=2020, output_format='slashTags')

If I leave out output_format, or try another format, I get {} as output. What get_entities actually returns, however, is not a slashTags string but a clearly superior (more informative) dict format:

tagger.get_entities(text)
{u'ORGANIZATION': [u'UNIVERSITY OF CALIFORNIA'], u'LOCATION': [u'CALIFORNIA', u'UNITED STATES'], u'O': [u'IS LOCATED IN', u',']}
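
To illustrate how slash-tagged text could turn into a dict like the one above, here is a minimal sketch. It is not pyner's actual implementation; the function name group_slash_tags and the sample string are assumptions made for illustration, and the sample simply follows the word/TAG shape that the slashTags name suggests.

```python
from itertools import groupby


def group_slash_tags(tagged_text):
    """Group consecutive tokens that share a tag into phrases, keyed by tag.

    Hypothetical helper: assumes whitespace-separated word/TAG tokens.
    """
    pairs = []
    for token in tagged_text.split():
        # Split on the last slash so a word that itself contains '/' is kept whole.
        word, _, tag = token.rpartition('/')
        pairs.append((word, tag))

    entities = {}
    # groupby only merges adjacent items, so each run of one tag becomes one phrase.
    for tag, run in groupby(pairs, key=lambda pair: pair[1]):
        phrase = ' '.join(word for word, _ in run)
        entities.setdefault(tag, []).append(phrase)
    return entities


# Assumed sample of slash-tagged server output for the sentence used in this thread.
sample = ("University/ORGANIZATION of/ORGANIZATION California/ORGANIZATION "
          "is/O located/O in/O California/LOCATION ,/O "
          "United/LOCATION States/LOCATION")
print(group_slash_tags(sample))
```

Grouping adjacent tokens is what keeps "University of California" together as one ORGANIZATION while the later "California" lands under LOCATION on its own.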

Liontooth commented 10 years ago

Incidentally, I also discovered that pyner works fine for stanford-pos too, though none of the custom formatting of the result string works. Just use tag_text for raw output and you have pypos.

fintrader commented 8 years ago

I think this case should be re-opened. Even using SocketNER I experienced the same issue, but once output_format was set to "slashTags", everything worked well.

Without specifying output_format:

tagger = ner.SocketNER(host='localhost', port=9191)
tagger.get_entities("University of California is located in California, United States")

Returned:

{}

With output_format specified:

tagger = ner.SocketNER(host='localhost', port=9191, output_format='slashTags')
tagger.get_entities("University of California is located in California, United States")

Returned:

{u'LOCATION': [u'California', u'United States'],
 u'O': [u'is located in', u','],
 u'ORGANIZATION': [u'University of California']}

Either way, I appreciate the work you're doing, dat. Thank you so much for all that you do.
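
The workaround reported in this thread (an empty dict without output_format, results with 'slashTags') can be wrapped in a small fallback helper. The sketch below is only an illustration: get_entities_with_fallback, make_tagger and FakeTagger are hypothetical names, and FakeTagger is a stand-in that mimics the reported behaviour so the example runs without a server. With pyner, make_tagger would be a small function that builds ner.SocketNER with the host, port and output_format shown above.

```python
class FakeTagger(object):
    """Stand-in for a real tagger: returns entities only for 'slashTags'."""

    def __init__(self, output_format=None):
        self.output_format = output_format

    def get_entities(self, text):
        if self.output_format == 'slashTags':
            return {'LOCATION': ['California']}
        return {}


def get_entities_with_fallback(make_tagger, text, formats=(None, 'slashTags')):
    """Try each output format in order; return the first non-empty result.

    make_tagger is a hypothetical factory that takes an output format
    (None meaning "library default") and returns an object with a
    get_entities(text) method.
    """
    for fmt in formats:
        entities = make_tagger(fmt).get_entities(text)
        if entities:
            return fmt, entities
    # Every format produced an empty result.
    return None, {}


fmt, entities = get_entities_with_fallback(FakeTagger, "California")
print(fmt, entities)
```

Returning the format alongside the entities makes it obvious which setting actually produced output, which helps when reporting the problem.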