JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
https://www.jaided.ai
Apache License 2.0
24.64k stars 3.17k forks

`easyocr.Reader.readtext(...)` on rare occasions returns bounding boxes with `float` coordinates, and not `int` #1307

Open ScheiBig opened 2 months ago

ScheiBig commented 2 months ago

I'm using EasyOCR to build a very simple banknote recognizer for a university project, using Python and OpenCV. If I understand the provided examples correctly, the code below:

import cv2
import numpy as np
import easyocr

cap = cv2.VideoCapture(...)
reader = easyocr.Reader(["en"])
did_read, frame = cap.read()

# some frame preprocessing if necessary - cropping to area of interest, adding filters and thresholding
processed_frame = frame  # placeholder: no preprocessing applied here

read_txts = reader.readtext(processed_frame)

should produce a list of results, each of which I type-hint in my code as:

eOcr_res = tuple[
    tuple[tuple[int, int], tuple[int, int], tuple[int, int], tuple[int, int]], # bounding box
    str, # label
    float # confidence (0.0 .. 1.0)
]

Of course, the actual result uses lists instead of tuples for the bounding box, but tuples allow slightly better type-checking, since a list cannot be type-hinted with a fixed length. This doesn't really matter, though.

What does matter is that it should be possible to use this output directly to draw the result on an image using OpenCV:
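As a minimal sketch of the list-versus-tuple point, one list-based entry can be converted into nested tuples so that it matches the annotation at runtime. The names `Point`, `Box`, `OcrResult`, `as_tuples` and the `sample` entry are hypothetical and only illustrate the shape described above; nothing here is part of EasyOCR.

```python
from typing import Any

# Same shape as eOcr_res above, split into named parts (hypothetical names).
Point = tuple[int, int]
Box = tuple[Point, Point, Point, Point]
OcrResult = tuple[Box, str, float]


def as_tuples(entry: Any) -> OcrResult:
    """Convert one list-based result entry into nested tuples."""
    box, text, conf = entry
    if len(box) != 4:
        raise ValueError(f"expected 4 corner points, got {len(box)}")
    # Coordinates are passed through unchanged; no rounding or casting here.
    corners = tuple((x, y) for x, y in box)
    return corners, text, float(conf)


# Hypothetical entry, shaped like the list-based output described above.
sample = [[[10, 20], [110, 20], [110, 50], [10, 50]], "100", 0.98]
typed = as_tuples(sample)
```

This only changes the container types; it does not validate or change the coordinate values themselves.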

for read_txt in read_txts:
    box, txt, conf = read_txt
    box = np.array(box)
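    cv2.putText(
        frame,                      # image to draw on
        txt,                        # recognized text
        tuple(box.max(axis=0)),     # text origin: bottom-right corner of the box
        cv2.FONT_HERSHEY_SIMPLEX,   # font face (required before the font scale)
        0.75,                       # font scale
        (0, 255, 0),                # color (BGR)
        1                           # thickness
    )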
    cv2.drawContours(
        frame,
        [box],
        -1,
        (0, 255, 0),
        2
    )

However, on rare occasions the snippet above raises an error at cv2.drawContours, with the message: cv2.error: OpenCV(4.10.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\drawing.cpp:2504: error: (-215:Assertion failed) npoints > 0 in function 'cv::drawContours'

On closer inspection in the debugger, it seems that when the error occurs, reader.readtext(...) returns a result in which the bounding box points are of type float, and not the expected int (int points are returned more than 99% of the time).

Of course, this can be fixed in user code, which in the snippet above would be:

box = np.array(box, dtype= np.int_)

however, I feel that either the examples in this repository's readme.md and on the site https://www.jaided.ai/easyocr/tutorial/ are misleading, since they suggest that only int values can be expected in the bounding box component of the output, or there is some rare bug which results in non-integer output.
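For anyone who wants the workaround in one place, here is a minimal sketch of a normalizing helper that can be applied to every box before drawing. `to_int_box` and the sample `float_box` are hypothetical names and values, not part of EasyOCR. Rounding to the nearest pixel is a choice made here; the `np.array(box, dtype=np.int_)` cast above truncates instead.

```python
def to_int_box(box):
    """Round each corner of a 4-point box and cast it to a built-in int.

    Accepts ints, floats, or NumPy scalars as coordinates.
    """
    return [[int(round(float(x))), int(round(float(y)))] for x, y in box]


# Hypothetical box with float coordinates, like the rare case described above.
float_box = [[10.0, 20.4], [110.6, 20.0], [110.0, 50.7], [10.2, 50.0]]
int_box = to_int_box(float_box)
```

In the drawing loop this would be used as `box = np.array(to_int_box(box), dtype=np.int32)`. Using `np.int32` rather than `np.int_` is an assumption on my side, based on OpenCV's contour functions working with 32-bit integer points; `np.int_` may be 64-bit depending on platform and NumPy version.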

ScheiBig commented 2 months ago

Sorry, I forgot to specify: I'm using EasyOCR version 1.7.1, with Python 3.12.6 on Windows 11.

daniellovera commented 2 months ago

Bounding boxes returned as polys can be returned with float coords.

From the API documentation: "Return horizontal_list, free_list - horizontal_list is a list of rectangular text boxes. The format is [x_min, x_max, y_min, y_max]. free_list is a list of free-form text boxes. The format is [[x1,y1],[x2,y2],[x3,y3],[x4,y4]]."
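To make the two formats in that quote concrete, here is a rough sketch that expands a rectangular box in the `[x_min, x_max, y_min, y_max]` layout into the four-point layout used for free-form boxes. `rect_to_corners` is a hypothetical helper, not an EasyOCR function, and the corner order (top-left, top-right, bottom-right, bottom-left) is an assumption chosen for drawing convenience.

```python
def rect_to_corners(rect):
    """Expand [x_min, x_max, y_min, y_max] into four [x, y] corner points."""
    x_min, x_max, y_min, y_max = rect
    return [
        [x_min, y_min],  # top-left
        [x_max, y_min],  # top-right
        [x_max, y_max],  # bottom-right
        [x_min, y_max],  # bottom-left
    ]


# Hypothetical rectangular box in the [x_min, x_max, y_min, y_max] layout.
corners = rect_to_corners([10, 110, 20, 50])
```

The values are passed through as-is, so any float coordinates stay float and still need casting before being handed to OpenCV drawing functions.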