jwilk-archive / ocrodjvu

OCR for DjVu
GNU General Public License v2.0

freeze if a page cannot be decoded #5

Closed by jwilk 11 years ago

jwilk commented 11 years ago

Issue reported by GStager at Bitbucket:

- Page #389
- Page #390
- Page #391
- Page #392
Unexpected End Of File.
- Page #393

... and then it freezes.

File: http://libgen.org/get?nametype=md5&md5=02AFD32D47B492D04DDFBD2359772884

Parameters: ~/ocrodjvu-0.7.15/ocrodjvu --in-place -e tesseract -t words --html5 --clear-text -l rus+eng 02AFD32D47B492D04DDFBD2359772884.djvu

jwilk commented 11 years ago

As the warning suggests, page 392 is broken:

$ ddjvu -page=392 02AFD32D47B492D04DDFBD2359772884.djvu 
ddjvu: Cannot decode page 392.

But of course ocrodjvu shouldn't hang. I'll get that fixed soon.

jwilk commented 11 years ago

Fixed in 56aab44c7bd9b8f6df25614acbc72e4fdf76a512. Instead of hanging, ocrodjvu now raises an exception when a page cannot be decoded. You should then be able to use --on-error=resume to skip over undecodable pages.
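
For readers curious what an abort/resume error policy looks like in general, here is a minimal Python sketch. It is not ocrodjvu's actual code: `PageDecodeError`, `decode_page`, and `process_pages` are hypothetical names invented for illustration, and the "decoder" is a stand-in that fails for page numbers listed in `broken`.

```python
class PageDecodeError(Exception):
    """Hypothetical error raised when a page cannot be decoded."""


def decode_page(number, broken):
    # Stand-in for a real decoder: pages listed in `broken` fail to decode.
    if number in broken:
        raise PageDecodeError("Cannot decode page %d." % number)
    return "text of page %d" % number


def process_pages(numbers, broken=(), on_error="abort"):
    """Return (results, skipped) under an abort-or-resume error policy."""
    if on_error not in ("abort", "resume"):
        raise ValueError("on_error must be 'abort' or 'resume'")
    results, skipped = {}, []
    for n in numbers:
        try:
            results[n] = decode_page(n, broken)
        except PageDecodeError:
            if on_error == "abort":
                raise  # stop at the first undecodable page
            skipped.append(n)  # resume: record the page and carry on
    return results, skipped
```

With the page numbers from the report, `process_pages(range(389, 394), broken={392}, on_error="resume")` would skip page 392 and still process 393, whereas the default `on_error="abort"` propagates the exception at page 392. Either way the loop terminates rather than waiting forever on a page that will never decode.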

jwilk commented 11 years ago

Fixed in 0.7.16.