RichJackson / cogstack

Database - Elasticsearch realtime mapping. With NLP goodiness.
Apache License 2.0

[BUG] Error in GATE apps will block the whole pipeline #27

Closed hkkenneth closed 7 years ago

hkkenneth commented 7 years ago

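Sometimes, when the database contains a bad record with no text content, the GATE pipeline fails. GateService pools its controllers in a LinkedBlockingQueue, so if processDoc() throws at the line controller.execute();, it never reaches genericQueue.put(controller); and the controller is never returned to the queue. Once every pooled controller has been lost this way, later documents wait on an empty queue and the whole pipeline blocks.

P.S. I have poolSize = 1, so a single failure is enough.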

Example of such an error:

gate.creole.ExecutionException: No sentences or tokens to process in document GATE Document_0000C Please run a sentence splitter and tokeniser first! 
    at gate.creole.POSTagger.execute(POSTagger.java:257) 
    at gate.util.Benchmark.executeWithBenchmarking(Benchmark.java:291) 
    at gate.creole.ConditionalSerialController.runComponent(ConditionalSerialController.java:163) 
    at gate.creole.SerialController.executeImpl(SerialController.java:157) 
    at gate.creole.ConditionalSerialAnalyserController.executeImpl(ConditionalSerialAnalyserController.java:225) 
    at gate.creole.ConditionalSerialAnalyserController.execute(ConditionalSerialAnalyserController.java:132) 
    at uk.ac.kcl.service.GateService.processDoc(GateService.java:135) 
    at uk.ac.kcl.itemProcessors.GateDocumentItemProcessor.lambda$process$0(GateDocumentItemProcessor.java:82) 
    at java.util.HashMap.forEach(HashMap.java:1288) 
    at uk.ac.kcl.itemProcessors.GateDocumentItemProcessor.process(GateDocumentItemProcessor.java:75) 
    at uk.ac.kcl.itemProcessors.GateDocumentItemProcessor.process(GateDocumentItemProcessor.java:37) 
    at org.springframework.batch.item.support.CompositeItemProcessor.processItem(CompositeItemProcessor.java:61) 
    at org.springframework.batch.item.support.CompositeItemProcessor.process(CompositeItemProcessor.java:50) 
    at org.springframework.batch.core.step.item.SimpleChunkProcessor.doProcess(SimpleChunkProcessor.java:126) 
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor$1.doWithRetry(FaultTolerantChunkProcessor.java:225) 
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:263) 
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:193) 
    at org.springframework.batch.core.step.item.BatchRetryTemplate.execute(BatchRetryTemplate.java:217) 
    at org.springframework.batch.core.step.item.FaultTolerantChunkProcessor.transform(FaultTolerantChunkProcessor.java:290) 
    at org.springframework.batch.core.step.item.SimpleChunkProcessor.process(SimpleChunkProcessor.java:192) 
    at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:75) 
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:406) 
    at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:330) 
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133) 
    at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:271) 
    at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:81) 
    at org.springframework.batch.repeat.support.TaskExecutorRepeatTemplate$ExecutingRunnable.run(TaskExecutorRepeatTemplate.java:262) 
    at org.springframework.core.task.SimpleAsyncTaskExecutor$ConcurrencyThrottlingRunnable.run(SimpleAsyncTaskExecutor.java:251) 
    at java.lang.Thread.run(Thread.java:745)
hkkenneth commented 7 years ago

Reopening this. Catching only ExecutionException is not enough, because JAPE logic or other parts of GATE may throw any other exception (something as simple as a StringIndexOutOfBoundsException).
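
One way to make the pool robust to any exception type is to return the controller in a finally block instead of catching specific exceptions. The sketch below is not the actual GateService code: the Controller interface, the ControllerPoolSketch class and the available() helper are hypothetical stand-ins, and only the names genericQueue, processDoc() and controller.execute() are taken from the description above.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ControllerPoolSketch {

    /** Hypothetical stand-in for a GATE controller; not the real GATE API. */
    interface Controller {
        void execute() throws Exception;
    }

    private final BlockingQueue<Controller> genericQueue = new LinkedBlockingQueue<>();

    /** Pool of size 1, matching the poolSize = 1 setup described in this issue. */
    ControllerPoolSketch(Controller controller) {
        genericQueue.add(controller);
    }

    void processDoc() throws Exception {
        // Blocks until a controller is available.
        Controller controller = genericQueue.take();
        try {
            controller.execute();
        } finally {
            // Runs whether execute() returned normally or threw anything,
            // so the controller always goes back into the pool.
            genericQueue.put(controller);
        }
    }

    /** Number of controllers currently sitting in the pool. */
    int available() {
        return genericQueue.size();
    }

    public static void main(String[] args) throws Exception {
        // Simulate a controller that fails with an unchecked exception.
        ControllerPoolSketch pool = new ControllerPoolSketch(() -> {
            throw new StringIndexOutOfBoundsException("simulated JAPE failure");
        });
        try {
            pool.processDoc();
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("document failed: " + e.getMessage());
        }
        // The controller was returned despite the failure.
        System.out.println("controllers available: " + pool.available());
    }
}
```

With this shape the exception still propagates to the caller, so the batch step can skip or retry the document as it sees fit, but the next call to processDoc() finds a controller in the queue instead of blocking.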