dstl / baleen

Entity Extraction Text Processor
Apache License 2.0

Fixed corrupt document handling #23

Closed jle123 closed 8 years ago

jle123 commented 8 years ago

Previously, Baleen would write the message "FILE CONTENTS CORRUPT - UNABLE TO PROCESS" to the JCas for any corrupt document. The corresponding test would then assert that the JCas text was null. This fix makes sure that the JCas keeps this message, and it modifies the corresponding test to assert that the JCas text equals this message.
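
For readers unfamiliar with the change, the behaviour described above can be sketched roughly as follows. This is an illustrative, self-contained sketch and not Baleen's actual code: `DocumentHolder` is a hypothetical stand-in for a UIMA JCas, `extract` is a hypothetical extractor, and the constant name `CORRUPT_FILE_TEXT` is an assumption. Only the message text itself comes from the description.

```java
import java.util.function.Supplier;

public class CorruptDocumentSketch {
    // Message text quoted in the pull request description; the constant name is an assumption.
    static final String CORRUPT_FILE_TEXT = "FILE CONTENTS CORRUPT - UNABLE TO PROCESS";

    /** Hypothetical stand-in for a UIMA JCas: it only holds the document text. */
    static final class DocumentHolder {
        private String text;

        void setDocumentText(String text) {
            this.text = text;
        }

        String getDocumentText() {
            return text;
        }
    }

    /**
     * Hypothetical extractor: runs the parser and, if parsing fails,
     * writes the placeholder message so the document text is never left null.
     */
    static void extract(Supplier<String> parser, DocumentHolder doc) {
        try {
            doc.setDocumentText(parser.get());
        } catch (RuntimeException e) {
            doc.setDocumentText(CORRUPT_FILE_TEXT);
        }
    }

    public static void main(String[] args) {
        DocumentHolder doc = new DocumentHolder();

        // Simulate a corrupt document by making the parser throw.
        extract(() -> { throw new IllegalStateException("corrupt input"); }, doc);

        // The updated expectation: the text equals the message rather than being null.
        if (!CORRUPT_FILE_TEXT.equals(doc.getDocumentText())) {
            throw new AssertionError("expected the corrupt-file message to be kept");
        }
    }
}
```

In a real Baleen test the assertion would presumably run against the JCas document text with a JUnit `assertEquals` in place of the manual check; that detail is not shown in this pull request description.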