ADAH-EviDENce / NewsReader

Docker build of full NewsReader pipeline in Dutch.
Apache License 2.0

faulty normalization resource files for ixa-heideltime #35

Open wmkouw opened 6 years ago

Goal: run ixa-heideltime on an input NAF file.

Input:

cat "$fn.naf" | java -jar "$TIM/target/ixa.pipe.time.jar" -m "$TIM/lib/alpino-to-treetagger.csv" -c "$TIM/config.props" > "$fn-tim.naf" 2> "$fn-tim.log"
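
Since the error message points at resource files, it may help to first rule out a plain path problem. The following is a minimal sketch, not part of the pipeline: `check_inputs` is a hypothetical helper name, and it only tests that each file passed to it is readable. It does not validate the contents of HeidelTime's normalization resources.

```shell
#!/bin/sh
# Hypothetical helper (not part of ixa-heideltime): report every path
# that is not a readable file. Prints one "missing: <path>" line per
# problem and returns non-zero if any path failed the check.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Possible usage with the paths from the command above:
# check_inputs "$fn.naf" \
#              "$TIM/target/ixa.pipe.time.jar" \
#              "$TIM/lib/alpino-to-treetagger.csv" \
#              "$TIM/config.props"
```

If this reports nothing, the files named on the command line are at least present, and the fault is more likely inside the resources HeidelTime loads itself.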

Problem: the run throws multiple exceptions and processing is aborted.

Stack trace:

May 15, 2018 1:46:52 PM ixa.pipe.time.IXAPipeHeidelTime initialize
INFO: IXAPipeHeidelTime initialized with language dutch
May 15, 2018 1:46:52 PM ixa.pipe.time.IXAPipeHeidelTime readConfigFile
INFO: trying to read in file ixa-heideltime/config.props
May 15, 2018 1:46:53 PM ixa.pipe.time.IXAPipeHeidelTime initialize
INFO: HeidelTime initialized
May 15, 2018 1:46:53 PM ixa.pipe.time.IXAPipeHeidelTime initialize
INFO: JCas factory initialized
May 15, 2018 1:46:53 PM ixa.pipe.time.IXAPipeHeidelTime process
INFO: Processing started
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] HeidelTime's execution has been interrupted by an exception that is likely rooted in faulty normalization resource files. Please consider opening an issue report containing the following information at our Google Code project issue tracker: https://code.google.com/p/heideltime. Thanks!
java.lang.NullPointerException
    at java.lang.String.replace(String.java:2240)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.applyRuleFunctions(HeidelTime.java:2270)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.getAttributesForTimexFromFile(HeidelTime.java:2373)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.findTimexes(HeidelTime.java:2197)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.process(HeidelTime.java:216)
    at ixa.pipe.time.IXAPipeHeidelTime.process(IXAPipeHeidelTime.java:317)
    at ixa.pipe.time.CLI.main(CLI.java:76)
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] Sentence [49343-49443]:  Mar dan... Ik kan het je precies laten zien, maar dan moet je op een kaart kijken, dat is vervelend
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] Language: DUTCH
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] Re-running this sentence with DEBUGGING enabled...
[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] Execution will now resume.
java.lang.NumberFormatException: For input string: "'"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:569)
    at java.lang.Integer.parseInt(Integer.java:615)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.specifyAmbiguousValuesString(HeidelTime.java:813)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.specifyAmbiguousValues(HeidelTime.java:1861)
    at de.unihd.dbs.uima.annotator.heideltime.HeidelTime.process(HeidelTime.java:276)
    at ixa.pipe.time.IXAPipeHeidelTime.process(IXAPipeHeidelTime.java:317)
    at ixa.pipe.time.CLI.main(CLI.java:76)
May 15, 2018 1:46:59 PM ixa.pipe.time.IXAPipeHeidelTime process
WARNING: Processing aborted due to errors
May 15, 2018 1:46:59 PM ixa.pipe.time.IXAPipeHeidelTime format
WARNING: Two overlapping Timexes have been discovered:
Timex A: dagen lang ["REMOVE" / 19447:19457]
Timex B: veertien dagen ["P14D" / 19438:19452] [removed]
The writer chose, for granularity: dagen lang
This usually happens with an incomplete ruleset. Please consider adding a new rule that covers the entire expression.
May 15, 2018 1:46:59 PM ixa.pipe.time.IXAPipeHeidelTime process
INFO: Result formatted
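
To triage a log like the one above across many input files, a small filter can help. This is a rough sketch that assumes the log keeps the layout shown here, with exception names at the start of a line and flagged sentences prefixed by the HeidelTime logger tag. `summarize_log` is a hypothetical helper name, not something the pipeline provides.

```shell
#!/bin/sh
# Hypothetical helper: summarize an ixa-heideltime log file.
# Assumes the log layout shown in this issue.
summarize_log() {
    log=$1
    # Count lines that open a Java exception report, such as
    # "java.lang.NullPointerException". Indented "at ..." stack
    # frames are not counted because of the leading anchor.
    n=$(grep -c '^java\.[A-Za-z.]*Exception' "$log")
    echo "exceptions: $n"
    # Print the sentences HeidelTime flagged, with character offsets.
    grep -F '[de.unihd.dbs.uima.annotator.heideltime.HeidelTime] Sentence [' "$log" |
        sed 's/^.*\] Sentence /sentence /'
}

# Possible usage with the log written by the command in this issue:
# summarize_log "$fn-tim.log"
```

The flagged sentences and their offsets are probably the most useful part to include when reporting the faulty rule upstream.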