mynlp / jigg

Pipeline framework for easy natural language processing
Apache License 2.0

Bug when using the annotators berkeleyparser and corenlp #92

Closed fyamamoto10 closed 6 years ago

fyamamoto10 commented 6 years ago

When I run JIGG with the annotators corenlp[tokenize,ssplit],berkeleyparser,corenlp[lemma,ner,dcoref], the following error occurs (regardless of the length of the sentence):

...
[main] WARN edu.stanford.nlp.ie.NumberNormalizer - java.lang.NumberFormatException: Bad number put into wordToNumber.  Word is: "pm", originally part of "pm o", piece # 0
  edu.stanford.nlp.ie.NumberNormalizer.wordToNumber(NumberNormalizer.java:381)
  edu.stanford.nlp.ie.NumberNormalizer.findNumbers(NumberNormalizer.java:721)
  edu.stanford.nlp.ie.NumberNormalizer.findAndMergeNumbers(NumberNormalizer.java:810)
  edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressions(TimeExpressionExtractorImpl.java:190)
  edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressions(TimeExpressionExtractorImpl.java:184)
  edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressionCoreMaps(TimeExpressionExtractorImpl.java:115)
  edu.stanford.nlp.time.TimeExpressionExtractorImpl.extractTimeExpressionCoreMaps(TimeExpressionExtractorImpl.java:105)
  edu.stanford.nlp.ie.regexp.NumberSequenceClassifier.runSUTime(NumberSequenceClassifier.java:345)
  edu.stanford.nlp.ie.regexp.NumberSequenceClassifier.classifyWithSUTime(NumberSequenceClassifier.java:143)
  edu.stanford.nlp.ie.regexp.NumberSequenceClassifier.classifyWithGlobalInformation(NumberSequenceClassifier.java:106)
  edu.stanford.nlp.ie.NERClassifierCombiner.recognizeNumberSequences(NERClassifierCombiner.java:368)
  edu.stanford.nlp.ie.NERClassifierCombiner.classifyWithGlobalInformation(NERClassifierCombiner.java:311)
  edu.stanford.nlp.ie.AbstractSequenceClassifier.classifySentenceWithGlobalInformation(AbstractSequenceClassifier.java:343)
  edu.stanford.nlp.pipeline.NERCombinerAnnotator.doOneSentence(NERCombinerAnnotator.java:290)
  edu.stanford.nlp.pipeline.SentenceAnnotator.annotate(SentenceAnnotator.java:102)
  edu.stanford.nlp.pipeline.NERCombinerAnnotator.annotate(NERCombinerAnnotator.java:253)
  edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
  edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:660)
...
Exception in thread "main" java.lang.NullPointerException
    at edu.stanford.nlp.coref.md.DependencyCorefMentionFinder.extractNPorPRPFromDependency(DependencyCorefMentionFinder.java:104)
    at edu.stanford.nlp.coref.md.DependencyCorefMentionFinder.findMentions(DependencyCorefMentionFinder.java:63)
    at edu.stanford.nlp.pipeline.CorefMentionAnnotator.annotate(CorefMentionAnnotator.java:154)
    at edu.stanford.nlp.pipeline.DeterministicCorefAnnotator.annotate
...

The output contains two distinct problems: (1) the warning [main] WARN edu.stanford.nlp.ie.NumberNormalizer and (2) Exception in thread "main" java.lang.NullPointerException. I confirmed that (1) occurs whenever the ner annotator is used. (2) does not occur with ner, but it does occur when dcoref is used.
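To make the pipeline order in this report easier to see, here is a minimal Python sketch that expands the annotator string into its individual steps. It is only an illustration: the function name expand_annotators is hypothetical, and this is not how Jigg itself parses the option.

```python
from typing import List, Tuple


def expand_annotators(spec: str) -> List[Tuple[str, str]]:
    """Expand an annotator string into ordered (tool, step) pairs.

    Hypothetical helper for illustration; not part of Jigg.
    """
    # Split on commas that are outside square brackets.
    stages = []
    depth = 0
    current = ""
    for ch in spec:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            stages.append(current)
            current = ""
        else:
            current += ch
    if current:
        stages.append(current)

    # Turn "tool[a,b]" into (tool, a), (tool, b); a bare "tool" gets an empty step.
    pairs = []
    for stage in stages:
        stage = stage.strip()
        if stage.endswith("]") and "[" in stage:
            tool, inner = stage[:-1].split("[", 1)
            pairs.extend((tool, step.strip()) for step in inner.split(","))
        else:
            pairs.append((stage, ""))
    return pairs


SPEC = "corenlp[tokenize,ssplit],berkeleyparser,corenlp[lemma,ner,dcoref]"
for tool, step in expand_annotators(SPEC):
    print(tool, step)
```

Listing the steps this way shows that ner and dcoref both run after berkeleyparser. Removing one of them from the last group at a time is one way to reproduce each problem in isolation, as described above.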