Does the same error occur when replacing line 68 of the server code
start = incoming.find("<sentences>")
with
start = incoming.find("<root>")
?
This works in my environment, but the script was written for a much older version, and I'm not sure this simple fix is sufficient. I will commit it if it works for you.
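For context, the extraction around that line presumably slices the XML document out of Jigg's raw output between a start and an end marker. Only the line-68 change above is confirmed in this thread; the rest of this sketch is an assumption about the surrounding code:

def extract_xml(incoming):
    """Return the XML document Jigg printed, or None if it is incomplete."""
    start = incoming.find("<root>")           # was: incoming.find("<sentences>")
    if start == -1:
        return None                           # output not (yet) complete
    end = incoming.find("</root>")            # assumed closing marker
    if end == -1:
        return None
    return incoming[start:end + len("</root>")]

In the 0.6.x output shown later in this thread, <root> is the outermost element, with <sentences> nested inside a <document>, which is why the old marker no longer matches the start of the document.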
I actually made this change already - and yes, I still get the above error.
The original server code made a call to transccg-0.2.jar, which isn't in the 0.6.1 bundle of course, so here I am using jigg-0.6.1.jar instead.
However, despite the change, when using the jaccg annotator or the full corenlp stack as below:
./script/pipeline_server.py -P "-Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline -annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]"
the error still comes from the call to self.pipeline.expect(). The dump below suggests that after the jigg command is run, the prompt characters "> " are never found.
File "./script/pipeline_server.py", line 85, in
<pexpect.pty_spawn.spawn object at 0x110575850>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx4g', '-cp', 'jigg-0.6.1.jar', 'jigg.pipeline.Pipeline', '-annotators', 'corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]']
buffer (last 100 chars): ''
before (last 100 chars): '\r\n\tat edu.stanford.nlp.parser.common.ParserGrammar.loadModel(ParserGrammar.java:185)\r\n\t... 30 more\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 7438
child_fd: 6
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 1000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_string:
    0: "> "
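For context on where this fails: pipeline_server.py drives Jigg through pexpect and waits for Jigg's interactive prompt before accepting requests. A minimal sketch of that handshake, reconstructed from the traceback above (the real script differs in detail):

import pexpect

# Spawn the Jigg pipeline and wait for its interactive "> " prompt.
# If the JVM exits during startup (for example because a model cannot be
# loaded), the prompt never appears and expect() raises pexpect.EOF.
child = pexpect.spawn(
    "java -Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline "
    "-annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]",
    timeout=30,
)
child.expect("> ")   # blocks until the prompt is seen, EOF, or the timeout

In the dump above, before ends with the ParserGrammar.loadModel stack trace and flag_eof is True, i.e. the JVM crashed while loading a model rather than merely taking too long.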
Probably the reason is that the model files for jaccg and corenlp are missing.
Please try the following:
./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:stanford-corenlp-3.6.0-models.jar\" jigg.pipeline.Pipeline -annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]"
with the appropriate path to the models jar. The same goes for jaccg.
This returns the same error, I am afraid:
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
Sorry for the late response. I revised the server code in the latest develop version, so could you try it? It now outputs the messages from Jigg when an error occurs, so we may be able to identify the reason for the failure from those outputs.
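Presumably the revision amounts to something like the following (a sketch only, not the actual commit): wrap the prompt wait in try/except and log whatever Jigg has printed so far.

import logging
import pexpect

logger = logging.getLogger(__name__)

def spawn_pipeline(command):
    # Assumed shape of the revised setup: on failure, surface Jigg's own
    # messages (child.before) instead of only the bare pexpect.EOF.
    child = pexpect.spawn(command, timeout=30)
    try:
        child.expect("> ", timeout=5000)
    except (pexpect.EOF, pexpect.TIMEOUT):
        logger.error("Failed to set up the pipeline! "
                     "Maybe the arguments given to jigg are corrupted?")
        logger.error(child.before)   # the messages Jigg printed before dying
        raise
    return child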
Likewise - apologies for getting back late. With your try/except modifications, here are the outputs.
For JACCG, I ran the download script as instructed but get the error below anyway. As evidence that the download script ran successfully, I can see jigg-0.6.1-models.jar in the jigg-0.6.1 directory.
For CoreNLP, it seems to be complaining about "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" not being available.
JACCG Command
./script/pipeline_server.py -P "-Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg"
Output
INFO:__main__:java -Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg
INFO:__main__:Spawning Jigg process...
ERROR:__main__:Failed to set up the pipeline! Maybe the arguments given to jigg are corrupted?
ERROR:__main__:Loading parser model in ccg-models/parser/beam=64.ser.gz ...Loading parser model in ccg-models/parser/beam=64.ser.gz ...Failed to start CCG parser. Make sure the model file of CCG is already installed. If not, execute the following command in jigg directory:
./script/download_models.sh
jaccg.model < str>: Path to the trained model (you can omit this if you load a jar which packs models) []
CoreNLP with all annotations Command
./script/pipeline_server.py -P "-Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline -annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]"
Output
INFO:__main__:Spawning Jigg process...
ERROR:__main__:Failed to set up the pipeline! Maybe the arguments given to jigg are corrupted?
ERROR:__main__:[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.IOException: Unable to open "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as class path, filename or URL
at edu.stanford.nlp.parser.common.ParserGrammar.loadModel(ParserGrammar.java:188)
at edu.stanford.nlp.pipeline.ParserAnnotator.loadModel(ParserAnnotator.java:212)
at edu.stanford.nlp.pipeline.ParserAnnotator.<init>(ParserAnnotator.java:115)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.parse(AnnotatorImplementations.java:150)
at edu.stanford.nlp.pipeline.AnnotatorFactories$11.create(AnnotatorFactories.java:463)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:375)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:139)
at jigg.pipeline.StanfordCoreNLPAnnotator.<init>(StanfordCoreNLPAnnotator.scala:60)
at jigg.pipeline.StanfordCoreNLPAnnotator$.fromProps(StanfordCoreNLPAnnotator.scala:784)
at jigg.pipeline.StanfordCoreNLPAnnotator$.fromProps(StanfordCoreNLPAnnotator.scala:747)
at jigg.pipeline.Pipeline$$anonfun$getAnnotator$1.apply(Pipeline.scala:185)
at jigg.pipeline.Pipeline$$anonfun$getAnnotator$1.apply(Pipeline.scala:185)
at scala.Option.map(Option.scala:146)
at jigg.pipeline.Pipeline.getAnnotator(Pipeline.scala:184)
at jigg.pipeline.Pipeline$$anonfun$5.apply(Pipeline.scala:136)
at jigg.pipeline.Pipeline$$anonfun$5.apply(Pipeline.scala:136)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at jigg.pipeline.Pipeline.createAnnotatorList(Pipeline.scala:136)
at jigg.pipeline.Pipeline.annotatorList$lzycompute(Pipeline.scala:130)
at jigg.pipeline.Pipeline.annotatorList(Pipeline.scala:130)
at jigg.pipeline.Pipeline.close(Pipeline.scala:132)
at jigg.pipeline.Pipeline$.main(Pipeline.scala:399)
at jigg.pipeline.Pipeline.main(Pipeline.scala)
Caused by: java.io.IOException: Unable to open "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:485)
at edu.stanford.nlp.io.IOUtils.readObjectFromURLOrClasspathOrFileSystem(IOUtils.java:323)
at edu.stanford.nlp.parser.common.ParserGrammar.loadModel(ParserGrammar.java:185)
... 30 more
For both commands, the reason for the errors seems to be that the models are not included in the current Java classpath. You should modify the command as follows:
./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:jigg-0.6.1-models.jar\" jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg"
That is, the downloaded model jar must be explicitly included in the -cp argument. The escaped quotes (\") are necessary here.
The error in corenlp seems similar: the corenlp models jar should also be listed in -cp.
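On why the escape matters: the outer double quotes delimit the whole -P string for your shell, so the inner quotes around the classpath have to be written as \" to survive into that string. Assuming the server then splits the string shell-style (e.g. with shlex.split; the actual splitting code isn't shown in this thread), the quoting keeps the classpath together as one argument and the quote characters themselves are stripped before the value reaches java:

import shlex

pipeline_opt = ('-Xmx4g -cp "jigg-0.6.1.jar:jigg-0.6.1-models.jar" '
                'jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg')

# The quotes group the classpath into a single token and are then removed:
print(shlex.split(pipeline_opt))
# ['-Xmx4g', '-cp', 'jigg-0.6.1.jar:jigg-0.6.1-models.jar',
#  'jigg.pipeline.Pipeline', '-annotators', 'ssplit,mecab,jaccg']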
This is strange - I am using the command above with the escapes but seem to get the same error... Any way to check that the jar file is valid, perhaps?
$ ./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:jigg-0.6.1-models.jar\" jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg"
INFO:__main__:java -Xmx4g -cp "jigg-0.6.1.jar:jigg-0.6.1-models.jar" jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg
INFO:__main__:Spawning Jigg process...
ERROR:__main__:Failed to set up the pipeline! Maybe the arguments given to jigg are corrupted?
ERROR:__main__:Loading parser model in ccg-models/parser/beam=64.ser.gz ...Loading parser model in ccg-models/parser/beam=64.ser.gz ...Failed to start CCG parser. Make sure the model file of CCG is already installed. If not, execute the following command in jigg directory:
./script/download_models.sh
jaccg.model < str>: Path to the trained model (you can omit this if you load a jar which packs models) []
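On checking whether the jar file is valid: a jar is just a zip archive, so a quick sanity check (a generic sketch, not part of the ccg2lambda or jigg scripts; the ccg-models/ prefix is taken from the error message above) is to confirm it opens as one and actually contains the CCG model entries:

import os
import zipfile

jar = "jigg-0.6.1-models.jar"
print("size: %d bytes" % os.path.getsize(jar))   # a truncated download is only a few bytes

if zipfile.is_zipfile(jar):
    with zipfile.ZipFile(jar) as zf:
        entries = [name for name in zf.namelist() if name.startswith("ccg-models/")]
        print("ccg-models entries: %d" % len(entries))
else:
    print("not a valid zip/jar -- re-run ./script/download_models.sh")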
Excuse me for cutting in. In my case, the -corenlp.ner.useSUTime false option is needed when calling the corenlp[ner] annotator.
If you run jigg without this option, you may see the following message:
$ java -cp target/jigg-assembly-0.6.1.jar jigg.pipeline.Pipeline -annotators "corenlp[tokenize, ssplit, parse, lemma, ner, dcoref]"
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8 -Djava.library.path=/home/kyoshinaga/local/HDFJAVA-3.2.1-Linux/HDF_Group/HDFJAVA/3.2.1/lib
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
done [0.4 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.3 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.5 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.6 sec].
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.1 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.4 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.5 sec].
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
Exception in thread "main" edu.stanford.nlp.util.ReflectionLoading$ReflectionLoadingException: Error creating edu.stanford.nlp.time.TimeExpressionExtractorImpl
at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(ReflectionLoading.java:40)
at edu.stanford.nlp.time.TimeExpressionExtractorFactory.create(TimeExpressionExtractorFactory.java:57)
at edu.stanford.nlp.time.TimeExpressionExtractorFactory.createExtractor(TimeExpressionExtractorFactory.java:38)
at edu.stanford.nlp.ie.regexp.NumberSequenceClassifier.<init>(NumberSequenceClassifier.java:82)
at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:85)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:108)
at edu.stanford.nlp.pipeline.AnnotatorFactories$6.create(AnnotatorFactories.java:333)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:375)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:139)
at jigg.pipeline.StanfordCoreNLPAnnotator.<init>(StanfordCoreNLPAnnotator.scala:60)
at jigg.pipeline.StanfordCoreNLPAnnotator$.fromProps(StanfordCoreNLPAnnotator.scala:797)
at jigg.pipeline.StanfordCoreNLPAnnotator$.fromProps(StanfordCoreNLPAnnotator.scala:754)
at jigg.pipeline.Pipeline$$anonfun$getAnnotator$1.apply(Pipeline.scala:181)
at jigg.pipeline.Pipeline$$anonfun$getAnnotator$1.apply(Pipeline.scala:181)
at scala.Option.map(Option.scala:146)
at jigg.pipeline.Pipeline.getAnnotator(Pipeline.scala:180)
at jigg.pipeline.Pipeline$$anonfun$5.apply(Pipeline.scala:139)
at jigg.pipeline.Pipeline$$anonfun$5.apply(Pipeline.scala:139)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at jigg.pipeline.Pipeline.createAnnotatorList(Pipeline.scala:139)
at jigg.pipeline.Pipeline.annotatorList$lzycompute(Pipeline.scala:133)
at jigg.pipeline.Pipeline.annotatorList(Pipeline.scala:133)
at jigg.pipeline.Pipeline.close(Pipeline.scala:135)
at jigg.pipeline.Pipeline$.main(Pipeline.scala:401)
at jigg.pipeline.Pipeline.main(Pipeline.scala)
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public edu.stanford.nlp.time.TimeExpressionExtractorImpl(java.lang.String,java.util.Properties) with args [sutime, {}]
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:235)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:380)
at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(ReflectionLoading.java:38)
... 32 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:231)
... 34 more
Caused by: java.lang.RuntimeException: Error initializing binder 1
at edu.stanford.nlp.time.Options.<init>(Options.java:92)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.init(TimeExpressionExtractorImpl.java:45)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.<init>(TimeExpressionExtractorImpl.java:39)
... 39 more
Caused by: java.lang.NullPointerException: Missing URL.
at de.jollyday.HolidayManager.getInstance(HolidayManager.java:190)
at edu.stanford.nlp.time.JollyDayHolidays.init(JollyDayHolidays.java:55)
at edu.stanford.nlp.time.Options.<init>(Options.java:90)
... 41 more
How do I make use of the -corenlp.ner.useSUTime false option in the command?
Neither
./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:jigg-0.6.1-models.jar\" jigg.pipeline.Pipeline -corenlp.ner.useSUTime false -annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref]"
nor
./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:jigg-0.6.1-models.jar\" jigg.pipeline.Pipeline -annotators corenlp[tokenize,ssplit,parse,lemma,ner,dcoref] -corenlp.ner.useSUTime false"
seems to help.
@JSB97 The next line of your output
INFO:__main__:java -Xmx4g -cp "jigg-0.6.1.jar:jigg-0.6.1-models.jar" jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg
says it tried to call this command. Could you try this command directly in the terminal? If this does work, please tell me your current environment (OS, etc.).
Sure, here is the output of the command. I am using the below (let me know if there are other dependencies you need):
Mac OS X (10.11.2)
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
Loading parser model in ccg-models/parser/beam=64.ser.gz ...Loading parser model in ccg-models/parser/beam=64.ser.gz ...Failed to start CCG parser. Make sure the model file of CCG is already installed. If not, execute the following command in jigg directory:
./script/download_models.sh
jaccg.model < str>: Path to the trained model (you can omit this if you load a jar which packs models) []
The output suggests that loading the model jar fails from the terminal as well. Is "jigg-0.6.1-models.jar" placed in the current directory?
There was indeed a models jar file, but it seems the ./script/download_models.sh script had not run successfully. The model file size was only a few bytes (which doesn't make sense), so I edited the script. All good now :+1:
Running the server as:
./script/pipeline_server.py -P "-Xmx4g -cp \"jigg-0.6.1.jar:jigg-0.6.1-models.jar\" jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg"
and passing data to the server as:
./script/client.py -i some_japanese.txt
now gives the desired result:
<root>
<document id="d0">
<sentences>
<sentence characterOffsetBegin="0" characterOffsetEnd="9" id="s0">
今日の天気は良い。
...
I can run jaccg on the terminal without issue. When running it through pipeline_server.py, however, the error below is returned.
I have extended the timeout to allow the necessary libraries/packages to load, but the command prompt character ">" still never seems to come back.
Using the mecab and cabocha annotators works fine; the error occurs with jaccg only.
Running corenlp with all annotation options also returns a similar error:
ERROR
INFO:__main__:java -Xmx4g -cp jigg-0.6.1.jar jigg.pipeline.Pipeline -annotators ssplit,mecab,jaccg
INFO:__main__:Spawn done!
Traceback (most recent call last):
  File "./script/pipeline_server.py", line 83, in <module>
    pipeline = Pipeline(options.pipeline)
  File "./script/pipeline_server.py", line 20, in __init__
    self.pipeline.expect("> ", timeout=5000)
  File "/usr/local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 321, in expect
    timeout, searchwindowsize, async)
  File "/usr/local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 345, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python2.7/site-packages/pexpect/expect.py", line 105, in expect_loop
    return self.eof(e)
  File "/usr/local/lib/python2.7/site-packages/pexpect/expect.py", line 50, in eof
    raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pty_spawn.spawn object at 0x10d356810>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx4g', '-cp', 'jigg-0.6.1.jar', 'jigg.pipeline.Pipeline', '-annotators', 'ssplit,mecab,jaccg']
buffer (last 100 chars): ''
before (last 100 chars): ' < str>: Path to the trained model (you can omit this if you load a jar which packs models) []\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 81078
child_fd: 6
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile("> ")
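A note on the timeout mentioned above: raising it only helps while the JVM is still alive and loading models. In the failures shown here the child process exits, so expect() raises EOF (process died) rather than TIMEOUT (process alive but silent), and no timeout value can fix that. A sketch of telling the two apart (illustrative names, not the actual pipeline_server.py):

import pexpect

def wait_for_jigg(child, timeout=300):
    # Distinguish a slow model load from a Jigg process that has crashed.
    try:
        child.expect("> ", timeout=timeout)
    except pexpect.TIMEOUT:
        # Still running, just slow: a longer timeout may genuinely help here.
        raise RuntimeError("Jigg still loading after %s seconds" % timeout)
    except pexpect.EOF:
        # Exited before printing the prompt: a longer timeout cannot help;
        # inspect child.before for the Java-side error (missing models, etc.).
        raise RuntimeError("Jigg exited during startup:\n%s" % child.before)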