Closed sjscotti closed 1 year ago
Sorry for the late reply @sjscotti, but this seems like an issue that should be solved in the @OCR-D context, so adding @kba here.
This could also be relevant for our current benchmarking in OCR-D - do you have a rough idea how many pages were processed before Eynollah crashed?
IIUC this will be fixed by https://github.com/OCR-D/core/pull/966 when OCR-D is used through the Web API.
Closing here as https://github.com/OCR-D/core/pull/966 has now been merged.
Hi, eynollah seems to crash sporadically on one image out of a large batch when I run it as an OCR-D processor. The crash may come after a large number of images have already been processed, which takes many hours. Because eynollah currently updates the `mets.xml` file with the newly created segmentation files only when the processor completes, all the results from that run are missing from the `mets.xml` file, so OCR cannot be performed on the successful segmentations. The two alternatives seem to be: 1) debug why eynollah is crashing (or eliminate the image causing the crash) and rerun all the images again, or 2) edit the `mets.xml` file by hand to include the info for the segmentations that completed before the crash. Is there another approach that can be used when this happens? If not, how about adding a flag to the OCR-D processor so that it periodically updates the `mets.xml` file with the info from the successful segmentations? Thanks!
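For reference, alternative 2 (registering the already written segmentation files in `mets.xml` after a crash) could be scripted rather than done by hand. The following is only a rough sketch using Python's standard library, not OCR-D's or eynollah's own API. The function name `register_page_file`, the file group name, the file ID, the path, and the page ID are all hypothetical placeholders, and it assumes the usual METS layout of a `mets:fileSec` with `mets:fileGrp` entries plus a physical `mets:structMap` whose page `mets:div` elements carry an `ID`.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

# MIME type conventionally used for PAGE-XML files
PAGE_MIME = "application/vnd.prima.page+xml"


def register_page_file(root, file_grp, file_id, href, page_id):
    """Add one PAGE-XML file to `file_grp` and link it to the page div `page_id`.

    Hypothetical helper: returns True if an entry was added, False if a
    file with `file_id` was already registered in that group.
    """
    file_sec = root.find(f"{{{METS_NS}}}fileSec")
    if file_sec is None:
        raise ValueError("no mets:fileSec found")

    # Locate the physical page first so nothing is modified on failure.
    page_div = next(
        (d for d in root.iter(f"{{{METS_NS}}}div") if d.get("ID") == page_id),
        None,
    )
    if page_div is None:
        raise ValueError(f"no mets:div with ID={page_id!r}")

    # Find the file group by its USE attribute, creating it if needed.
    grp = next(
        (g for g in file_sec.findall(f"{{{METS_NS}}}fileGrp")
         if g.get("USE") == file_grp),
        None,
    )
    if grp is None:
        grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp", {"USE": file_grp})

    # Skip files that are already registered, so the script can be re-run.
    if any(f.get("ID") == file_id for f in grp.findall(f"{{{METS_NS}}}file")):
        return False

    file_el = ET.SubElement(
        grp, f"{{{METS_NS}}}file", {"ID": file_id, "MIMETYPE": PAGE_MIME}
    )
    ET.SubElement(
        file_el,
        f"{{{METS_NS}}}FLocat",
        {"LOCTYPE": "OTHER", "OTHERLOCTYPE": "FILE", f"{{{XLINK_NS}}}href": href},
    )
    ET.SubElement(page_div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return True


# Tiny stand-in for a real mets.xml (all names here are made up).
SAMPLE_METS = f"""<mets:mets xmlns:mets="{METS_NS}" xmlns:xlink="{XLINK_NS}">
  <mets:fileSec/>
  <mets:structMap TYPE="PHYSICAL">
    <mets:div TYPE="physSequence">
      <mets:div TYPE="page" ID="PHYS_0001"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""

root = ET.fromstring(SAMPLE_METS)
added = register_page_file(
    root, "OCR-D-SEG", "OCR-D-SEG_0001", "OCR-D-SEG/OCR-D-SEG_0001.xml", "PHYS_0001"
)
print(added)
print(ET.tostring(root, encoding="unicode"))
```

On a real workspace one would parse the existing file with `ET.parse("mets.xml")`, call the helper once per segmentation file found on disk, and write the result back with `tree.write("mets.xml", encoding="utf-8", xml_declaration=True)`, ideally after making a backup copy. Skipping IDs that are already present keeps the script safe to run repeatedly.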