awslabs / handwritten-text-recognition-for-apache-mxnet

This repository lets you train neural network models for end-to-end full-page handwriting recognition using the Apache MXNet deep learning framework on the IAM Dataset.
Apache License 2.0
481 stars 189 forks

Issue with resizing image (`resize_image()` function) from `ocr.utils.iam_dataset.py` for paragraph segmentation! #38

Closed naveen-marthala closed 4 years ago

naveen-marthala commented 4 years ago

During paragraph segmentation in "0_handwriting_ocr.ipynb", paragraph_segmentation_transform(image, form_size) is called to segment the image into paragraphs. It in turn calls the resize_image() function from ocr.utils.iam_dataset.py to resize the image. (The images I passed are not from the IAM dataset; I passed my own images in the images array for text recognition.) The error occurs at line 72 of that file:

color = image[0][0]
if color < 230:
    color = 230

The problem occurs because image[0][0] is an array, not a single value. How do I fix this and proceed? Here is a screenshot of the error: [image]
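For readers who cannot see the screenshot, the failure can likely be reproduced with a small sketch like the one below. It assumes the image was loaded as a 3-channel NumPy array; `color_image` is a made-up stand-in, not data from the notebook, and the exact message in the original screenshot is not known.

```python
import numpy as np

# Hypothetical stand-in for a colour image loaded as (height, width, 3).
color_image = np.full((4, 6, 3), 255, dtype=np.uint8)

# For a 3-channel image this is an array of three values, not a scalar.
pixel = color_image[0][0]

try:
    # Same comparison as the check quoted above; comparing an array with a
    # number yields an array of booleans, which `if` cannot reduce to one value.
    if pixel < 230:
        pixel = 230
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(pixel.shape)
print(error_message)
```

With a 2-D (single-channel) array the same comparison works, because `image[0][0]` is then a single number.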

jonomon commented 4 years ago

I believe that, by default, the IAMDataset class uses grayscale images, so the code assumes the image is a 2-D array of w * h values, one per pixel. In that case image[0][0] is a scalar. If you are using coloured images, image[0][0] will be a vector of size 3 corresponding to R, G, and B. To use the provided code, please try converting your images to grayscale.
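One way to do the conversion suggested above is sketched here with plain NumPy. The helper name `to_grayscale` is hypothetical (it is not part of this repository), the luminance weights are the common ITU-R BT.601 ones, and the channel order is assumed to be R, G, B; images loaded with OpenCV are usually B, G, R, so the weights would need to be reversed in that case.

```python
import numpy as np


def to_grayscale(image):
    """Hypothetical helper: return a 2-D uint8 array for a colour or grayscale image."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Already single-channel; nothing to do.
        return image
    if image.ndim == 3 and image.shape[2] in (3, 4):
        # Assumes R, G, B channel order; an alpha channel, if present, is ignored.
        rgb = image[..., :3].astype(np.float64)
        weights = np.array([0.299, 0.587, 0.114])
        gray = rgb @ weights
        return np.clip(np.rint(gray), 0, 255).astype(np.uint8)
    raise ValueError(f"unsupported image shape: {image.shape}")


# Made-up colour image used only for illustration.
color_image = np.full((4, 6, 3), 200, dtype=np.uint8)
gray_image = to_grayscale(color_image)

# The check from the issue now sees a scalar and runs without error.
color = gray_image[0][0]
if color < 230:
    color = 230
```

Alternatively, the image can be read as grayscale in the first place, for example with `cv2.imread(path, cv2.IMREAD_GRAYSCALE)` in OpenCV or `Image.open(path).convert("L")` in Pillow, before it is passed to paragraph_segmentation_transform.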

naveen-marthala commented 4 years ago

Will do that, thanks.