Open stweil opened 6 years ago
I am still not sure whether it is an error when `median_bottom_` still has its initial value, or whether that is normal and should be handled. In any case I expect that the integer overflow will result in wrong layout recognition, so it will be visible in the OCR result.
Maybe somebody finds a simpler test image which triggers the overflow, too. I think it must have a multi column layout. My test image is a little bit large and takes a lot of time for OCR.
It's a pity that `-ftrapv` costs performance (about 35 % longer execution time according to my tests); otherwise we could always enable it.
Did commit 7f911ac5e027ac8a fix this issue?
See also #320.
I am afraid that the current code still does not handle all cases which can result in an integer overflow, so more tests with `-ftrapv` enabled are needed.
The functions `VCoreOverlap` and `VSignificantCoreOverlap` calculate integer differences which overflow when one of the operands is ±INT32_MAX (for example when `median_bottom_ == INT32_MAX`). The GNU compiler can build code which detects integer overflow at runtime (compiler option `-ftrapv`). Tesseract then gets an unhandled trap and terminates.

Overflows are triggered with this image by running `tesseract -l script/Fraktur 0604.jp2 0604`.
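
To illustrate the kind of difference that overflows, here is a minimal sketch of one possible guard. It is not Tesseract's actual code: the function name `SafeCoreOverlap` and its four parameters are hypothetical, and I am assuming the overlap is computed as "smaller top minus larger bottom". The idea is to do the subtraction in 64 bits and clamp the result back to the 32 bit range, so that sentinel values such as ±INT32_MAX cannot trigger a signed overflow.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an overlap calculation on two vertical ranges.
// Assumption: overlap = min(top_a, top_b) - max(bottom_a, bottom_b).
// With 32 bit arithmetic this difference overflows when, for example,
// the top is -INT32_MAX and the bottom is INT32_MAX.
int32_t SafeCoreOverlap(int32_t top_a, int32_t bottom_a,
                        int32_t top_b, int32_t bottom_b) {
  // Widen to 64 bits before subtracting, so the difference always fits.
  const int64_t top = std::min(top_a, top_b);
  const int64_t bottom = std::max(bottom_a, bottom_b);
  const int64_t diff = top - bottom;

  // Clamp the result back into the int32_t range instead of wrapping.
  const int64_t lo = std::numeric_limits<int32_t>::min();
  const int64_t hi = std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(std::max(lo, std::min(hi, diff)));
}
```

Whether clamping is the right behaviour, or whether an uninitialized `median_bottom_` should instead be treated as "no overlap", is exactly the open question here; the sketch only shows how to avoid the trap.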