Crashes in 3.0 when scanning text with long words (or long lines)

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Run 3.0 on an image that has long words or long horizontal lines

The problem goes away with this temporary modification to tfacepp.cpp:

#if defined(THISDOESNTWORK)
  if (word->blob_list ()->length () > MAX_UNDIVIDED_LENGTH) {
    return split_and_recog_word (word, denorm, matcher, tester, trainer,
      testing, raw_choice, blob_choices,
      outword);
  } else {
#else
  {
#endif

It needs a longer-term solution in split_and_recog_word.  Note that
this splitting is never done in normal OCR, only with malformed text
or long horizontal colored lines (lots of gaps when thresholded).
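
For anyone looking at a longer-term fix, here is a minimal sketch of one way such splitting could work. It assumes the word is cut at the widest gap between neighbouring blobs until every piece is short enough. It is not the actual split_and_recog_word code: Box, WidestGapSplit, SplitRange and SplitLongWord are hypothetical names used only for illustration, and max_len stands in for MAX_UNDIVIDED_LENGTH.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a blob: only its horizontal extent matters here.
struct Box {
  int left;
  int right;
};

// A piece is a half-open index range [first, second) into the blob list.
using Piece = std::pair<std::size_t, std::size_t>;

// Returns the index i in (begin, end) where the gap between boxes[i - 1]
// and boxes[i] is widest. Requires end - begin >= 2.
std::size_t WidestGapSplit(const std::vector<Box>& boxes,
                           std::size_t begin, std::size_t end) {
  std::size_t best = begin + 1;
  int best_gap = boxes[best].left - boxes[best - 1].right;
  for (std::size_t i = begin + 2; i < end; ++i) {
    const int gap = boxes[i].left - boxes[i - 1].right;
    if (gap > best_gap) {
      best_gap = gap;
      best = i;
    }
  }
  return best;
}

// Recursively splits [begin, end) until each piece has at most max_len boxes.
// Both halves are always non-empty, so the recursion terminates.
void SplitRange(const std::vector<Box>& boxes, std::size_t begin,
                std::size_t end, std::size_t max_len,
                std::vector<Piece>* pieces) {
  if (end - begin <= max_len) {
    if (end > begin) pieces->push_back({begin, end});
    return;
  }
  const std::size_t mid = WidestGapSplit(boxes, begin, end);
  SplitRange(boxes, begin, mid, max_len, pieces);
  SplitRange(boxes, mid, end, max_len, pieces);
}

// Entry point: returns the pieces in left-to-right order.
std::vector<Piece> SplitLongWord(const std::vector<Box>& boxes,
                                 std::size_t max_len) {
  assert(max_len >= 1);
  std::vector<Piece> pieces;
  SplitRange(boxes, 0, boxes.size(), max_len, &pieces);
  return pieces;
}
```

Each piece would then be recognized on its own and the results joined back together. In the worst case the recursion depth grows linearly with the number of blobs, so for very long horizontal lines an explicit stack may be the safer choice.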

Original issue reported on code.google.com by edhamr...@aol.com on 28 Aug 2009 at 8:45

GoogleCodeExporter commented 9 years ago

Can you provide an example image for this issue so the latest code can be tested?

Original comment by zde...@gmail.com on 2 Aug 2011 at 8:02

GoogleCodeExporter commented 9 years ago

It's possible this was fixed with r675.  Please try your image again (reopen
with an attachment if this is still an issue).

Original comment by david.e...@gmail.com on 19 Feb 2012 at 10:13

Changed state: Fixed
