OpenPhilology / nidaba

An expandable and scalable OCR pipeline
GNU General Public License v2.0

When segmentation jobs fail, batch crashes #25

Closed sonofmun closed 6 years ago

sonofmun commented 6 years ago

This may also be the case when other jobs fail, but I have noticed it as soon as I get a NidabaTesseractException, which happens, I think, because Tesseract segmentation craps out on empty pages. At that point, the segmentation jobs that are already in the queue finish, but the jobs after the segmentation jobs do not even start, even for the pages that did not get a segmentation error.

mittagessen commented 6 years ago

That is intended behavior and a result of the "hopefully-celery-doesn't-change-behavior-weekly-breaking-everything" rewrite of the task orchestration late last year. There is a refactoring that changes the behavior to run all runnable tasks in the tree, but it relies on some celery functionality that is not widely used and is therefore prone to breaking or changing behavior without notice, so I haven't really tied everything together to make it usable.
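
To illustrate what "run all runnable tasks in the tree" could mean, here is a minimal sketch in plain Python. It is not nidaba's orchestration code and does not use celery; the function `run_tree`, the task names, and the `deps` mapping are all hypothetical. The idea is only that a failed task blocks its own descendants, while every other branch keeps going.

```python
from collections import deque


def run_tree(tasks, deps):
    """Run every task whose dependencies all succeeded.

    tasks: dict mapping task name -> zero-argument callable (hypothetical)
    deps:  dict mapping task name -> list of task names it depends on
    Returns (done, failed, skipped) as sets of task names.
    """
    done, failed, skipped = set(), set(), set()
    pending = deque(tasks)
    while pending:
        progressed = False
        # Make one pass over everything that is still waiting.
        for _ in range(len(pending)):
            name = pending.popleft()
            parents = deps.get(name, [])
            if any(p in failed or p in skipped for p in parents):
                # An ancestor failed: skip only this branch.
                skipped.add(name)
                progressed = True
            elif all(p in done for p in parents):
                try:
                    tasks[name]()
                    done.add(name)
                except Exception:
                    failed.add(name)
                progressed = True
            else:
                # Parents have not run yet; try again on the next pass.
                pending.append(name)
        if not progressed:
            # Nothing could run (cycle or unknown parent): give up on the rest.
            skipped.update(pending)
            break
    return done, failed, skipped


def ok():
    return None


def empty_page():
    # Stand-in for a segmentation error on an empty page.
    raise RuntimeError("segmentation failed on empty page")


# Two pages, each with a segmentation step followed by an OCR step.
tasks = {"seg:p1": ok, "seg:p2": empty_page, "ocr:p1": ok, "ocr:p2": ok}
deps = {"ocr:p1": ["seg:p1"], "ocr:p2": ["seg:p2"]}
done, failed, skipped = run_tree(tasks, deps)
```

In this toy batch, the segmentation failure on page 2 only prevents the OCR step for page 2; page 1 runs through both steps.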