By default, tesseract produces gibberish for me. I noticed that `convert` is commented out in favor of `gs`. I tried `convert -depth 8 -background white -flatten -matte -density 300 <input> <output>` instead, and tesseract produced great results. The whole process was a lot faster too: ~1 minute with `convert` vs ~15 minutes with `gs` for 6 pages. Why is Ghostscript used rather than ImageMagick for the conversion?
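
In case it helps anyone reproduce this, here is a rough dry-run sketch of the two steps I mean. It only prints the commands rather than running them. The helper names (`convert_cmd`, `tesseract_cmd`) and the file names (`input.pdf`, `page.tif`, `page`) are placeholders I made up for illustration, not anything from the project; the `convert` flags are exactly the ones quoted above.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Helper names and file names below are hypothetical placeholders.

# Print the ImageMagick call from the post for a given input and output file.
convert_cmd() {
  printf 'convert -depth 8 -background white -flatten -matte -density 300 %s %s\n' "$1" "$2"
}

# Print the matching tesseract call. The second argument is an output
# base name; tesseract appends the extension (.txt) itself.
tesseract_cmd() {
  printf 'tesseract %s %s\n' "$1" "$2"
}

# Show both steps for the placeholder file names.
convert_cmd input.pdf page.tif
tesseract_cmd page.tif page
```

Piping the printed lines to `sh` would actually run them, assuming both ImageMagick and tesseract are installed.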