TeluguOCR / banti_telugu_ocr

End to end OCR system for Telugu. Based on Convolutional Neural Networks.
Apache License 2.0

Errors: When combining two glyphs #16

Closed ChillarAnand closed 7 years ago

ChillarAnand commented 7 years ago

banti was working well with Pillow==3.1.1. On another system, that Pillow version could not be installed, so I installed the latest version, Pillow==3.4.2. It now throws the following error with the given sample file.

chillaranand@pavilion:~/projects/python/ocr/banti_telugu_ocr on git:master o |
→ python recognize.py sample_images/praasa.tif
Command line Arguments
        calibration         1
        input_file_or_dir   sample_images/praasa.tif
        labels_fname        labellings/alphacodes.lbl
        log_level           20
        ngram_fname         library/mega.123.pkl
        nnet_fname          library/nn.pkl
        scaler_fname        scalings/relative48.scl

/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
  "downsample module has been moved to the theano.tensor.signal.pool module.")
Initializing the OCR
Compiling full test function...
         OCR initialized.
************************************************************
PROCESSING sample_images/praasa.tif
Classifing glyphs...
/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/page.py:341: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  horz_buffer = np.zeros((self.ht, brick_wd))
/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/numpy/core/numeric.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  a = empty(shape, dtype, order)
/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/page.py:352: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  self.word_closed_arr = self.word_closed_arr[:, brick_wd:-brick_wd]
Finding most likely sentences...
Line  0
Traceback (most recent call last):
  File "recognize.py", line 154, in <module>
    ocr_pattern(args.input_file_or_dir)
  File "recognize.py", line 149, in ocr_pattern
    recognizer.ocr_file(inpt)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/ocr.py", line 63, in ocr_file
    gramgraph.process_tree()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 83, in process_tree
    self.process_node(0)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 47, in process_node
    self.process_node(chld_id)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/linegraph.py", line 58, in process_node
    do_combine, new_wt = chld_wt.combine(gc_wt)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/proglyph.py", line 124, in combine
    combined = self + other
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 128, in __add__
    summ = self.__class__()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/proglyph.py", line 100, in __init__
    super().__init__(info)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 29, in __init__
    self.init_from_box_6pack_list(['', 0, 0, 0, 0, 0, 0, 0, 0, None])
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 64, in init_from_box_6pack_list
    self.pix_from_sixpack()
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 100, in pix_from_sixpack
    self.set_pix(pix)
  File "/home/chillaranand/projects/python/ocr/banti_telugu_ocr/banti/glyph.py", line 104, in set_pix
    self.img = im.fromarray(255 * (1 - self.pix))
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2187, in fromarray
    return frombuffer(mode, size, obj, "raw", rawmode, 0, 1)
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2114, in frombuffer
    _check_size(size)
  File "/home/chillaranand/.virtualenvs/p35/lib/python3.5/site-packages/PIL/Image.py", line 2001, in _check_size
    raise ValueError("Width and Height must be > 0")
ValueError: Width and Height must be > 0

rakeshvar commented 7 years ago

I thought this was fixed in this commit.

rakeshvar commented 7 years ago

Maybe the above commit fixes the same problem, but elsewhere. Also, shouldn't the requirements file say >= instead of ==?

ChillarAnand commented 7 years ago

It is better not to use >=. If upstream packages introduce breaking changes, people using banti will run into problems and start complaining here.

Maintainers should first verify against new versions and then update the requirements file accordingly.
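
To illustrate the difference between the two specifiers, here is a minimal sketch. It is not how pip resolves versions (pip follows the full version-specifier rules); the helper names `parse_version` and `satisfies` are hypothetical, and it assumes plain dotted numeric versions such as 3.1.1.

```python
def parse_version(text):
    """Turn '3.4.2' into (3, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(requirement, installed):
    """Check an installed version against a 'name==X' or 'name>=X' line.

    Simplified on purpose: only '==' and '>=' are understood.
    """
    for op in ("==", ">="):
        if op in requirement:
            _, wanted = requirement.split(op, 1)
            wanted_v = parse_version(wanted.strip())
            installed_v = parse_version(installed)
            if op == "==":
                return installed_v == wanted_v
            return installed_v >= wanted_v
    raise ValueError("unsupported requirement: " + requirement)


# The exact pin rejects the newer release; the open-ended one accepts it.
print(satisfies("Pillow==3.1.1", "3.4.2"))  # False
print(satisfies("Pillow>=3.1.1", "3.4.2"))  # True
```

With ==, an untested release such as 3.4.2 is never picked up silently; with >=, it is accepted even though nobody has verified banti against it.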

rakeshvar commented 7 years ago

Makes sense. That is what is happening here: upstream Pillow no longer supports zero as a width or height.

>>> from PIL import Image as im
>>> import numpy as np
>>> im.fromarray(np.empty((0, 0), dtype=np.uint8))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/dist-packages/PIL/Image.py", line 2187, in fromarray
    return frombuffer(mode, size, obj, "raw", rawmode, 0, 1)
  File "/usr/local/lib/python3.4/dist-packages/PIL/Image.py", line 2114, in frombuffer
    _check_size(size)
  File "/usr/local/lib/python3.4/dist-packages/PIL/Image.py", line 2001, in _check_size
    raise ValueError("Width and Height must be > 0")
ValueError: Width and Height must be > 0

Should we file a bug there, or update our code so that it does not depend on 0x0 images?
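
If the second option were chosen, one possible shape for it is a guard in front of the conversion. This is only a sketch under assumptions, not banti's actual fix: `has_pixels` and `safe_fromarray` are hypothetical names, and the converter is passed in as a parameter so the guard itself does not depend on any particular Pillow version.

```python
def has_pixels(shape):
    """True when a (height, width) shape describes at least one pixel."""
    return len(shape) == 2 and shape[0] > 0 and shape[1] > 0


def safe_fromarray(pix, fromarray):
    """Convert pix with fromarray only if it is a non-empty 2-D array.

    pix       -- any object with a .shape attribute (e.g. a NumPy array)
    fromarray -- the converter to call (e.g. PIL's Image.fromarray)
    Returns None for empty input instead of letting the converter raise.
    """
    if not has_pixels(pix.shape):
        return None
    return fromarray(pix)
```

In `set_pix`, the call shown in the traceback would then become something like `self.img = safe_fromarray(255 * (1 - self.pix), im.fromarray)`, and any code that reads `self.img` would need to tolerate `None` for an empty glyph.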

rakeshvar commented 7 years ago

I filed a bug there.

ChillarAnand commented 7 years ago

It is better to fix it upstream. Thanks.