What steps will reproduce the problem?
1. Tesseract 3.02+ command line
2. "tesseract -l eng Image_crop.png Image pdf"
What is the expected output? What do you see instead?
> I expect tesseract to run and produce output
> Instead, Tesseract crashes with "ACCESS VIOLATION (0xC0000005)"-type error.
What version of the product are you using? On what operating system?
Seen in Tesseract 3.02.02 and code from SVN around March 2015.
Windows 7
32-bit (Win32) Tesseract builds.
Please provide any additional information below.
- Doesn't happen in 64-bit Windows build (lucky?)
- Attached image has non-white pixels at image edges - this seems to trigger
this crash bug.
- Access violation occurs in TextlineProjection::MeanPixelsInLineSegment() when
it calls GET_DATA_BYTE() (~line 550). This can break when the start_pt/end_pt Y
value is 0 and offset is negative. It can also break when the start_pt/end_pt
Y value is at the bottom of the image and offset is positive.
Both conditions lead to an attempted read of data either before or after the
image buffer.
- Similar problems would occur horizontally (i.e. X value = 0 or at the right
edge of the image). In these cases there is less chance of stepping outside the
image buffer (unless at a corner), but a good chance that the algorithm will not
read the intended data because the read wraps to the other side of the image.
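
To illustrate the kind of guard that would avoid both the out-of-buffer read and the horizontal wrap-around, here is a minimal sketch. It is not the Tesseract code: GrayImage and ClampedPixel are hypothetical names, the image is assumed to be a plain 8-bit row-major buffer rather than a Leptonica Pix, and the offset is assumed to be applied to the row index. Clamping is only one option; skipping out-of-range samples may suit the algorithm better.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an 8-bit, row-major image buffer.
// This is NOT Tesseract's or Leptonica's image type.
struct GrayImage {
  int width;
  int height;
  std::vector<std::uint8_t> pixels;  // width * height bytes, row by row
};

// Clamps v into the closed range [0, hi].
inline int ClampIndex(int v, int hi) {
  return std::max(0, std::min(v, hi));
}

// Reads the pixel at column x, row (y + offset), clamping both coordinates
// to the image so the read never leaves the buffer and never wraps onto a
// neighbouring row.
inline std::uint8_t ClampedPixel(const GrayImage& img, int x, int y,
                                 int offset) {
  assert(img.width > 0 && img.height > 0);
  assert(img.pixels.size() ==
         static_cast<std::size_t>(img.width) * img.height);
  const int cx = ClampIndex(x, img.width - 1);
  const int cy = ClampIndex(y + offset, img.height - 1);
  return img.pixels[static_cast<std::size_t>(cy) * img.width + cx];
}
```

With a guard like this, a sample at Y = 0 with a negative offset reads from row 0 instead of from memory before the buffer, and a sample past the right edge reads the last column of the same row instead of the first pixel of the next row.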
Original issue reported on code.google.com by rtaylor...@gmail.com on 8 Jul 2015 at 10:53