stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

Dedicated server ignoring annotator list #149

Closed harryj closed 8 years ago

harryj commented 8 years ago

I'm running the CoreNLP dedicated server on AWS and making requests to it from Ruby. The server receives the request, but it seems to ignore the annotators list I pass in and always defaults to the full set of annotators. My Ruby code for the request looks like this:

uri = URI.parse(URI.encode('http://ec2-************.compute.amazonaws.com//?properties={"tokenize.whitespace": "true", "annotators": "tokenize,ssplit,pos", "outputFormat": "json"}'))

http = Net::HTTP.new(uri.host, uri.port)
request = Net::HTTP::Post.new("/v1.1/auth")
request.add_field('Content-Type', 'application/json')
request.body = text
response = http.request(request)
json = JSON.parse(response.body)

In the nohup.out logs on the server I see the following:

[/38.122.182.107:53507] API call w/annotators tokenize,ssplit,pos,depparse,lemma,ner,mention,coref,natlog,openie
.... INPUT TEXT BLOCK HERE ....
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [2.0 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
PreComputed 100000, Elapsed Time: 2.259 (s)
Initializing dependency parser done [5.1 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2.6 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.2 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [7.2 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt
Feb 22, 2016 11:37:20 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFO: Read 83 rules
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt
Feb 22, 2016 11:37:20 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFO: Read 267 rules
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
Feb 22, 2016 11:37:20 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules
INFO: Read 25 rules
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator mention
Using mention detector type: dependency
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref

etc etc.

Any help as to why this is happening would be appreciated, thanks!

harryj commented 8 years ago

When I run test queries using wget on the command line it seems to work fine.

wget --post-data 'the quick brown fox jumped over the lazy dog' 'ec2-*******.compute.amazonaws.com/?properties={"tokenize.whitespace": "true", "annotators": "tokenize,ssplit,pos", "outputFormat": "json"}' -O -

harryj commented 8 years ago

I was constructing the request wrong.
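
The thread does not say what exactly was wrong with the request. One plausible reading of the Ruby snippet above is that Net::HTTP::Post.new("/v1.1/auth") posts to that fixed path, so the ?properties=... query on the parsed uri never reaches the server and it falls back to its default annotators. The following is a minimal sketch under that assumption, not the author's confirmed fix. The helper name build_corenlp_request and the http://localhost:9000 address are placeholders introduced here for illustration. It also uses URI.encode_www_form in place of URI.encode, which was removed in Ruby 3.0.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Hypothetical helper (not part of CoreNLP or the original post):
# builds a POST request whose path includes the ?properties=... query,
# so the annotator list travels with the request.
def build_corenlp_request(base_url, text, annotators: 'tokenize,ssplit,pos')
  props = {
    'tokenize.whitespace' => 'true',
    'annotators'          => annotators,
    'outputFormat'        => 'json'
  }

  uri = URI.parse(base_url)
  # Percent-encode the JSON properties as a single query parameter.
  uri.query = URI.encode_www_form('properties' => JSON.generate(props))

  # request_uri is the path plus the query string; passing a bare path
  # here would drop the properties.
  request = Net::HTTP::Post.new(uri.request_uri)
  # Mirrors the Content-Type that wget --post-data sends by default.
  request['Content-Type'] = 'application/x-www-form-urlencoded'
  request.body = text

  [uri, request]
end

# Placeholder address; substitute the real server host and port.
uri, request = build_corenlp_request(
  'http://localhost:9000',
  'the quick brown fox jumped over the lazy dog'
)
puts request.path

# Sending the request needs a running server, so it is left commented out:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# json = JSON.parse(response.body)
```

Printing request.path before sending makes it easy to confirm that the properties query is actually part of what goes over the wire, which is the same information the wget command passes in its URL.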