jwilk-archive / ocrodjvu

OCR for DjVu
GNU General Public License v2.0

Crash with empty page #7

Closed by jwilk 8 years ago

jwilk commented 11 years ago

Issue reported by GStager at Bitbucket:


- Page #9
Page 0
Empty page!!
Exception while processing page 9:
Traceback (most recent call last):
  File "/home/stager/ocrodjvu-0.7.16/lib/cli/ocrodjvu.py", line 418, in page_thread
    result = self.process_page(page)
  File "/home/stager/ocrodjvu-0.7.16/lib/cli/ocrodjvu.py", line 401, in process_page
    page_size=size
ValueError: need more than 0 values to unpack

I think an empty page is a normal case.

jwilk commented 11 years ago

It certainly shouldn't crash on an empty page. Unfortunately, I can't reproduce it here. Exactly which version of Tesseract is that?

jwilk commented 11 years ago

Comment submitted by GStager at Bitbucket:

Sorry for the delay.

tesseract 3.02.02

leptonica-1.69

libgif 4.1.6 : libjpeg 8b : libpng 1.2.46 : libtiff 3.9.5 : zlib 1.2.3.4

jwilk commented 11 years ago

Comment submitted by dolfik at Bitbucket:

I'm facing the same problem:

root@localhost:~/temp# ocrodjvu -e tesseract -o "ocr/1a.djvu" --language pol "1a.djvu"
Processing '1a.djvu':
- Page #1
Exception while processing page 1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/ocrodjvu/cli/ocrodjvu.py", line 418, in page_thread
    result = self.process_page(page)
  File "/usr/local/lib/python2.7/dist-packages/ocrodjvu/cli/ocrodjvu.py", line 401, in process_page
    page_size=size
ValueError: need more than 0 values to unpack
Intermediate files were left in the '/tmp/ocrodjvu.bQNUYv' directory.

tesseract 3.02, ocrodjvu 0.7.16

jwilk commented 10 years ago

Can I get a DjVu that triggers the bug for you?

jwilk commented 9 years ago

Ping?

jwilk commented 8 years ago

This should be fixed in 2c1164731715aa897bb08f63791427de1f61569f.

jwilk commented 8 years ago

Fixed in 0.9.2.
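
For context on the error itself: "need more than 0 values to unpack" is the message Python 2 gives when a sequence-unpacking assignment receives an empty sequence (Python 3 words it differently, but also raises ValueError). The sketch below only illustrates that failure mode and one possible guard. It is not the ocrodjvu code and not necessarily what the fix commit does; the function names, the `boxes` argument, and the `fallback` parameter are all hypothetical.

```python
# Hypothetical illustration only: none of these names come from ocrodjvu.

def unpack_single(boxes):
    """Naive version: assumes the OCR step returned exactly one item."""
    [item] = boxes  # raises ValueError if boxes is empty
    return item


def unpack_single_safe(boxes, fallback):
    """Defensive version: treats an empty result as a valid empty page."""
    if not boxes:
        return fallback
    [item] = boxes
    return item


if __name__ == "__main__":
    try:
        unpack_single([])
    except ValueError as exc:
        print("naive version failed:", exc)
    # The guarded version returns the fallback instead of raising.
    print(unpack_single_safe([], fallback=None))
```

The guard makes "no recognized content" an ordinary outcome rather than an exception, which matches the expectation stated earlier in the thread that an empty page should not crash the run.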