Closed by GoogleCodeExporter 9 years ago
The textline finder in Tesseract uses random numbers as part of a least median
of squares fitting algorithm. Because of this, repeated runs on the same image
can produce different results, especially on short text lines such as yours.
This is a known issue that will be fixed in a future release.
Original comment by theraysm...@gmail.com
on 20 Aug 2009 at 4:30
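To illustrate the comment above, here is a minimal sketch of a randomized least median of squares line fit. It is not Tesseract's code: the function name `lmeds_line`, the `trials` parameter, and the sample points are all made up for illustration. It only shows why drawing random point pairs can give a different fit on each run when the line has few points, and how a fixed seed makes the result repeatable.

```python
import random
from statistics import median

def lmeds_line(points, trials=3, rng=None):
    """Fit y = m*x + c by least median of squares over random point pairs.

    Hypothetical helper for illustration only; not Tesseract's implementation.
    Returns (median_squared_residual, m, c), or None if no usable pair was drawn.
    """
    if rng is None:
        rng = random.Random()  # unseeded: each run may draw different pairs
    best = None
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical pair, slope undefined; skip this trial
        m = (y2 - y1) / (x2 - x1)
        c = y1 - m * x1
        # Score the candidate line by the median of the squared residuals.
        score = median((y - (m * x + c)) ** 2 for x, y in points)
        if best is None or score < best[0]:
            best = (score, m, c)
    return best

# A short "text line": few points, one outlier (made-up data).
points = [(0, 10.0), (1, 10.2), (2, 9.9), (3, 10.1), (4, 14.0), (5, 10.0)]

# Unseeded runs may disagree because different pairs are sampled each time.
print(lmeds_line(points))
print(lmeds_line(points))

# With a fixed seed the same pairs are drawn, so the fit is repeatable.
a = lmeds_line(points, rng=random.Random(1))
b = lmeds_line(points, rng=random.Random(1))
print(a == b)
```

With only a handful of points and a few trials, one unlucky draw that includes the outlier can change the chosen line noticeably, which matches the behaviour described for short text lines.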
Hi Ray,
I have the same issue and wanted to add that in my case the 1st scan is
absolutely perfect, but the 2nd scan is not only different, it completely
misses the bottom half of the text area. See attached image:
- 1st scan finds "Toll Free: 877.887.1818" and "www.serenitymovers.com"
- 2nd scan finds "Toll Free: 877.887.1818" and "WWW.SIIIi1Zy|11OV\S.COT\"
So it seems that the use of the random variable makes it go from perfect to
very poor. Hope this helps. I'm using Tesseract version 2.03, by the way.
Original comment by patrick....@gmail.com
on 26 Aug 2009 at 1:37
Fixed in 3.01
Original comment by theraysm...@gmail.com
on 20 May 2010 at 6:31
3.02 gives different output for the same input.
Original comment by ydf...@gmail.com
on 14 Jul 2015 at 1:11
Original issue reported on code.google.com by carlamar...@gmail.com
on 18 Aug 2009 at 8:34