Describe the bug
I use CoreNLP 3.9.2 as a remote server with StanfordNLP 0.2.0.
I want to work on pre-tokenized inputs by setting the properties tokenize.whitespace, tokenize.keepeol, and ssplit.eolonly to True (see the script under To Reproduce).
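For reference, the segmentation these settings are meant to produce (tokens split on whitespace only, one sentence per input line) can be approximated in plain Python. This is only a rough sketch of the intended behaviour, not CoreNLP's implementation, and `pretokenized_sentences` is a hypothetical helper name used for illustration.

```python
def pretokenized_sentences(text):
    """Approximate whitespace tokenization with one sentence per line.

    Hypothetical helper: it mimics the intent of tokenize.whitespace and
    ssplit.eolonly, not the actual CoreNLP code path.
    """
    sentences = []
    for line in text.split("\n"):
        tokens = line.split()  # split on runs of whitespace only
        if tokens:             # ignore empty lines
            sentences.append(tokens)
    return sentences


text = ("Chris Manning is a nice person.\n"
        "Chris wrote a simple sentence. He also gives oranges to people.\n"
        "test it.")

for tokens in pretokenized_sentences(text):
    print(tokens)
```

Under this approximation the example text yields three sentences, the second line stays a single sentence, and punctuation remains attached to the preceding word (for example `person.`).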
I received the following NullPointerException:
Client-side
starting up Java Stanford CoreNLP Server...
Starting server with command: java -Xmx16G -cp /myhome/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-5074c5d23e934dbf.props -preload tokenize,ssplit,pos,lemma,ner,parse,depparse,coref
Traceback (most recent call last):
File "/myhome/.local/lib/python3.6/site-packages/stanfordnlp/server/client.py", line 330, in _request
r.raise_for_status()
File "/myhome/github_repo/requests/requests/models.py", line 840, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%27outputFormat%27%3A+%27serialized%27%7D
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 24, in <module>
ann = client.annotate(text)
File "/myhome/.local/lib/python3.6/site-packages/stanfordnlp/server/client.py", line 398, in annotate
r = self._request(text.encode('utf-8'), request_properties, **kwargs)
File "/myhome/.local/lib/python3.6/site-packages/stanfordnlp/server/client.py", line 336, in _request
raise AnnotationException(r.text)
stanfordnlp.server.client.AnnotationException: java.util.concurrent.ExecutionException: java.lang.NullPointerException
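The query string in the failing URL above is the URL-encoded set of per-request properties. A minimal sketch using only the Python standard library decodes it; the URL is copied verbatim from the traceback, and using `ast.literal_eval` to turn the decoded text into a dict is an assumption based on its Python-literal appearance.

```python
import ast
from urllib.parse import parse_qs, urlsplit

# URL copied from the HTTPError message in the client-side traceback.
url = ("http://localhost:9000/"
       "?properties=%7B%27outputFormat%27%3A+%27serialized%27%7D")

# parse_qs percent-decodes the value and turns '+' into a space.
query = parse_qs(urlsplit(url).query)
raw_properties = query["properties"][0]
print(raw_properties)

# The decoded value looks like a Python dict literal, so parse it as one.
request_properties = ast.literal_eval(raw_properties)
print(request_properties["outputFormat"])
```

The request itself only carries the output format. The tokenizer and sentence-splitter settings do not appear in it, which is consistent with them being passed to the server through the -serverProperties file shown in the startup command.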
Server-side
[pool-1-thread-3] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
java.util.concurrent.ExecutionException: java.lang.NullPointerException
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)
at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:870)
at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77)
at jdk.httpserver/sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:82)
at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:80)
at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:692)
at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77)
at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:664)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.NullPointerException
at edu.stanford.nlp.pipeline.NERCombinerAnnotator.annotate(NERCombinerAnnotator.java:322)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:637)
at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.lambda$handle$0(StanfordCoreNLPServer.java:857)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
... 3 more
To Reproduce
Run the following example. It is the example from the documentation, modified by adding the properties and by changing the input text so that it contains EOL characters.
from stanfordnlp.server import CoreNLPClient
# example text
print('---')
print('input text')
print('')
text = "Chris Manning is a nice person.\nChris wrote a simple sentence. He also gives oranges to people.\ntest it."
print(text)
# set up the client
print('---')
print('starting up Java Stanford CoreNLP Server...')
# set up the client
properties = {
    'tokenize.whitespace': True,
    'tokenize.keepeol': True,
    'ssplit.eolonly': True
}
with CoreNLPClient(annotators=['tokenize','ssplit','pos','lemma','ner', 'parse', 'depparse','coref'], timeout=30000, memory='16G', properties=properties) as client:
    # submit the request to the server
    ann = client.annotate(text)
    # get the first sentence
    sentence = ann.sentence[0]
    # get the constituency parse of the first sentence
    print('---')
    print('constituency parse of first sentence')
    constituency_parse = sentence.parseTree
    print(constituency_parse)
    # get the first subtree of the constituency parse
    print('---')
    print('first subtree of constituency parse')
    print(constituency_parse.child[0])
    # get the value of the first subtree
    print('---')
    print('value of first subtree of constituency parse')
    print(constituency_parse.child[0].value)
    # get the dependency parse of the first sentence
    print('---')
    print('dependency parse of first sentence')
    dependency_parse = sentence.basicDependencies
    print(dependency_parse)
    # get the first token of the first sentence
    print('---')
    print('first token of first sentence')
    token = sentence.token[0]
    print(token)
    # get the part-of-speech tag
    print('---')
    print('part of speech tag of token')
    token.pos
    print(token.pos)
    # get the named entity tag
    print('---')
    print('named entity tag of token')
    print(token.ner)
    # get an entity mention from the first sentence
    print('---')
    print('first entity mention in sentence')
    print(sentence.mentions[0])
    # access the coref chain
    print('---')
    print('coref chains for the example')
    print(ann.corefChain)
    # Use tokensregex patterns to find who wrote a sentence.
    pattern = '([ner: PERSON]+) /wrote/ /an?/ []{0,3} /sentence|article/'
    matches = client.tokensregex(text, pattern)
    # sentences contains a list with matches for each sentence.
    assert len(matches["sentences"]) == 3
    # length tells you whether or not there are any matches in this
    assert matches["sentences"][1]["length"] == 1
    # You can access matches like most regex groups.
    matches["sentences"][1]["0"]["text"] == "Chris wrote a simple sentence"
    matches["sentences"][1]["0"]["1"]["text"] == "Chris"
    # Use semgrex patterns to directly find who wrote what.
    pattern = '{word:wrote} >nsubj {}=subject >dobj {}=object'
    matches = client.semgrex(text, pattern)
    # sentences contains a list with matches for each sentence.
    assert len(matches["sentences"]) == 3
    # length tells you whether or not there are any matches in this
    assert matches["sentences"][1]["length"] == 1
    # You can access matches like most regex groups.
    matches["sentences"][1]["0"]["text"] == "wrote"
    matches["sentences"][1]["0"]["$subject"]["text"] == "Chris"
    matches["sentences"][1]["0"]["$object"]["text"] == "sentence"
Expected behavior
The server returns the serialized annotation results instead of a 500 error.
Environment (please complete the following information):
OS: Ubuntu
Python version: 3.6.x
StanfordNLP version: 0.2.0
CoreNLP version: 3.9.2
Additional context
CoreNLP 3.9.1 doesn't have this issue.
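Since the server-side trace points at NERCombinerAnnotator, a shorter pipeline that stops at ner may help confirm whether the NER step alone triggers the error. The following is an untested sketch under that assumption: `run_reduced_pipeline`, `REDUCED_ANNOTATORS`, and `PRETOKENIZED_PROPERTIES` are hypothetical names, and the client arguments are the same ones used in the script above.

```python
# Annotators up to and including ner; parse, depparse and coref are dropped.
REDUCED_ANNOTATORS = ['tokenize', 'ssplit', 'pos', 'lemma', 'ner']

# Same pre-tokenization properties as in the original script.
PRETOKENIZED_PROPERTIES = {
    'tokenize.whitespace': True,
    'tokenize.keepeol': True,
    'ssplit.eolonly': True,
}


def run_reduced_pipeline(text):
    """Annotate text with the reduced pipeline and return the result.

    Needs the stanfordnlp package and a local CoreNLP installation,
    so the import is deferred until the function is called.
    """
    from stanfordnlp.server import CoreNLPClient

    with CoreNLPClient(annotators=REDUCED_ANNOTATORS,
                       timeout=30000,
                       memory='16G',
                       properties=PRETOKENIZED_PROPERTIES) as client:
        return client.annotate(text)
```

Calling `run_reduced_pipeline("Chris Manning is a nice person.\ntest it.")` against both 3.9.1 and 3.9.2 would show whether the difference between the versions is already visible at the ner step.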