Closed Necrolis closed 10 years ago
As much as I love a good bug report (by which I mean one that is as well documented as yours) I am unable to reproduce the bug:
foo.txt:
rt @bob: I really hate fifa 2015. ya
java edu.stanford.nlp.pipeline.StanfordCoreNLP -file foo.txt -pos.model /u/nlp/data/pos-tagger/distrib/english-caseless-left3words-distsim.tagger
On Tue, Jul 22, 2014 at 12:05 PM, Necrolis notifications@github.com wrote:
When using the caseless pos-tagger, it is possible to trigger a null-reference exception in edu.stanford.nlp.dcoref.sievepasses.DeterministicCorefSieve.sortMentionsForPronoun when there is a dangling pronoun.
A simple repro-case using a simplified tweet that can trigger the exception: rt @bob: I really hate fifa 2015. ya
which yields this trace:
Exception in thread "main" java.lang.RuntimeException: Error annotating C:\Users***\Desktop\stanford-corenlp-full-2014-06-16\input.txt
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$15.run(StanfordCoreNLP.java:1288)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1348)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.run(StanfordCoreNLP.java:1390)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1460)
Caused by: java.lang.NullPointerException
    at edu.stanford.nlp.dcoref.sievepasses.DeterministicCorefSieve.sortMentionsForPronoun(DeterministicCorefSieve.java:482)
    at edu.stanford.nlp.dcoref.sievepasses.DeterministicCorefSieve.getOrderedAntecedents(DeterministicCorefSieve.java:464)
    at edu.stanford.nlp.dcoref.SieveCoreferenceSystem.coreference(SieveCoreferenceSystem.java:898)
    at edu.stanford.nlp.dcoref.SieveCoreferenceSystem.coref(SieveCoreferenceSystem.java:845)
    at edu.stanford.nlp.pipeline.DeterministicCorefAnnotator.annotate(DeterministicCorefAnnotator.java:121)
    at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:67)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:848)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$15.run(StanfordCoreNLP.java:1276)
    ... 3 more
Admittedly this is not correct English in any way; however, it would be nice to see a little more robustness in the system :)
— Reply to this email directly or view it on GitHub https://github.com/stanfordnlp/CoreNLP/issues/31.
Seems I forgot a vital bit, which I hadn't realized till now (as I didn't test with other parse models); this error only occurs when the SR parser is used (via parse.model=edu/stanford/nlp/models/srparser/englishSR.ser.gz).
Thanks for pointing this out. There was a weird case where the parser produced a tree that didn't have ROOT on top, followed by a usage of the parse trees in coref where the ending condition was ROOT or null, but it checked at the end of the loop instead of the start (presumably thinking no tree would ever fail to have ROOT at top). I have checked in a fix for the second issue and will try to fix the first issue as well.
John
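The check-at-end versus check-at-start distinction described above can be sketched in isolation. This is only an illustration under stated assumptions, not the actual CoreNLP code or the fix that was checked in: `RootGuardSketch`, `Node`, `ancestorsCheckAtEnd`, and `ancestorsCheckAtStart` are hypothetical names, and the tree is a bare parent-pointer structure rather than a real parse tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from CoreNLP.
final class RootGuardSketch {

    // Minimal stand-in for a parse-tree node with a parent pointer.
    static final class Node {
        final String label;
        Node parent; // null at the top of the tree

        Node(String label) { this.label = label; }

        Node child(String childLabel) {
            Node c = new Node(childLabel);
            c.parent = this;
            return c;
        }
    }

    // Check at the end of the loop: the body dereferences cur before the
    // "ROOT or null" condition is ever evaluated.
    static List<String> ancestorsCheckAtEnd(Node start) {
        List<String> labels = new ArrayList<>();
        Node cur = start.parent;
        do {
            labels.add(cur.label); // NullPointerException if start has no parent
            cur = cur.parent;
        } while (cur != null && !"ROOT".equals(cur.label));
        return labels;
    }

    // Check at the start of the loop: a null parent simply ends the walk.
    static List<String> ancestorsCheckAtStart(Node start) {
        List<String> labels = new ArrayList<>();
        Node cur = start.parent;
        while (cur != null && !"ROOT".equals(cur.label)) {
            labels.add(cur.label);
            cur = cur.parent;
        }
        return labels;
    }

    public static void main(String[] args) {
        // A tree whose top node is not ROOT, as in the case described above.
        Node rootless = new Node("FRAG");
        System.out.println(ancestorsCheckAtStart(rootless));
        try {
            ancestorsCheckAtEnd(rootless);
        } catch (NullPointerException e) {
            System.out.println("check-at-end variant threw NullPointerException");
        }
    }
}
```

As long as every tree has ROOT on top, any non-ROOT starting node has a parent and the check-at-end loop never sees a null on entry; a tree without ROOT breaks that assumption at its top node.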
Awesome! I've managed to collect quite a few examples that trigger this condition (some of which I think constitute correct grammatical forms), so just give me the heads up when I can test the changes out.
On a sort of side note: I hadn't noticed that the SR models received an update (your site doesn't make this too clear; I only realized there was an update from the mailing list). Comparing the June 16th models with the July 1st models, I noticed the latter avoids crashing on a few cases we found. The example proposed before, however, still produces the crash.
If I remember correctly, you had pointed out some other bug which I fixed by putting out a new set of models and calling them the 3.4.0 models without actually changing the version number.
You should be able to test the changes from github already.
Did some testing with the new changes; everything seems to be running smoothly :)
Thanks for the quick fix!