1. Create an image file with a single quote in its name, for example
"C:\temp\tesseract's_fail.png".
2. Run tesseract with hOCR output (tesseract.exe "C:\temp\tesseract's_fail.png"
"C:\temp\hocr" hocr).
3. The resulting file hocr.html is invalid.
The single quote is not escaped in the title attribute of the page div, so it ends the single-quoted attribute value early. For me the output is:
<div class='ocr_page' id='page_1' title='image "C:\temp\tesseract's_fail.png";
bbox 0 0 275 297; ppageno 0'>
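To illustrate the kind of escaping that would avoid this, here is a minimal sketch in Python. It is not Tesseract's actual code (Tesseract is written in C++); the function name `hocr_page_div` and its parameters are hypothetical, and it assumes the title is emitted inside a single-quoted attribute as in the output above.

```python
import html


def hocr_page_div(image_path, bbox, page_no):
    """Build an ocr_page opening tag with the title attribute escaped.

    Hypothetical helper for illustration only; not part of Tesseract.
    """
    # Raw title text in the same layout as the output shown in the report.
    raw_title = 'image "{}"; bbox {} {} {} {}; ppageno {}'.format(
        image_path, bbox[0], bbox[1], bbox[2], bbox[3], page_no
    )
    # quote=True also escapes both " and ', so the value cannot
    # terminate a single-quoted attribute early.
    safe_title = html.escape(raw_title, quote=True)
    return "<div class='ocr_page' id='page_{}' title='{}'>".format(
        page_no + 1, safe_title
    )


tag = hocr_page_div(r"C:\temp\tesseract's_fail.png", (0, 0, 275, 297), 0)
print(tag)
```

With this approach the apostrophe in the file name is written as a character reference, so the only literal single quotes left in the tag are the attribute delimiters.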
Windows 7
tesseract-3.02
Original issue reported on code.google.com by irodio...@biarum.com on 8 May 2014 at 9:24