dstl / baleen

Entity Extraction Text Processor
Apache License 2.0

Temporal parsing fails when timezone is not included #35

Closed jonnyelliot closed 7 years ago

jonnyelliot commented 7 years ago

Seen against the 2.3 SNAPSHOT release of 2016-11-01.

The error occurred on a document that contained the dates "20 January 2014" and "20 Jan 2014", neither of which includes a timezone.

2016-11-23 14:03:20,106 WARN  uk.gov.dstl.baleen.core.pipelines.BaleenPipeline - Pipeline ran with errors
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.    
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:401)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:308)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:893)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: java.lang.NullPointerException: null
    at java.util.TimeZone.parseCustomTimeZone(TimeZone.java:783)
    at java.util.TimeZone.getTimeZone(TimeZone.java:562)
    at java.util.TimeZone.getTimeZone(TimeZone.java:516)
    at uk.gov.dstl.baleen.annotators.regex.DateTime.processDayMonthTime(DateTime.java:127)
    at uk.gov.dstl.baleen.annotators.regex.DateTime.doProcess(DateTime.java:46)
    at uk.gov.dstl.baleen.uima.BaleenAnnotator.process(BaleenAnnotator.java:81)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385)
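
The trace suggests that a null zone ID reached TimeZone.getTimeZone from DateTime.processDayMonthTime, which would fit a date pattern whose optional timezone group did not match. Below is a minimal sketch of a null-safe lookup. It is not Baleen's actual code or fix: the class name, the DATE pattern, the helper methods, and the choice of UTC as the fallback are all assumptions made for illustration.

```java
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not part of Baleen.
final class OptionalTimeZoneExample {

    // Assumed pattern: day, month name, year, then an optional zone such as "GMT".
    private static final Pattern DATE =
        Pattern.compile("(\\d{1,2}) ([A-Za-z]+) (\\d{4})(?: ([A-Z]{2,4}))?");

    // Returns the captured zone text, or null when the text has no zone
    // or does not match the pattern at all.
    static String extractZone(String text) {
        Matcher m = DATE.matcher(text);
        return m.matches() ? m.group(4) : null;
    }

    // Guards the lookup so a missing zone never reaches TimeZone.getTimeZone.
    // Falling back to UTC is an assumption, not documented Baleen behaviour.
    static TimeZone zoneOrDefault(String zoneId) {
        if (zoneId == null) {
            return TimeZone.getTimeZone("UTC");
        }
        return TimeZone.getTimeZone(zoneId);
    }

    public static void main(String[] args) {
        // The two dates from the report; neither carries a zone.
        for (String text : new String[] {"20 January 2014", "20 Jan 2014"}) {
            TimeZone zone = zoneOrDefault(extractZone(text));
            System.out.println(text + " -> " + zone.getID());
        }
    }
}
```

With a guard like this, an unmatched optional group is handled as "no zone given" rather than being passed straight through to the JDK.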

jbaker-dstl commented 7 years ago

This has been fixed in our current development code - I'll look to get a new SNAPSHOT release done soon so we can close this off.