Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0
4.67k stars, 449 forks

lp.draw_text not working with Tesseract engine #110

Open rathorology opened 2 years ago

rathorology commented 2 years ago

import cv2

try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
import layoutparser as lp

ocr_agent = lp.TesseractAgent()
pytesseract.pytesseract.tesseract_cmd = "Tesseract-OCR/tesseract.exe"
path = "images/a.jpg"
img = cv2.imread(path)

custom_oem_psm_config = r'--oem 3 --psm 6'

res = pytesseract.image_to_string(img, lang='eng', config=custom_oem_psm_config)

layout = ocr_agent.detect(img, return_response=True)
print(layout)

lp.draw_text(img, layout, font_size=12, with_box_on_text=True, text_box_width=1)

ERROR:
Traceback (most recent call last):
  File "D:\PycharmProjects\ocr\layout_parser.py", line 22, in <module>
    lp.draw_text(img, layout, font_size=12, with_box_on_text=True,
  File "C:\Users\adityara\AppData\Roaming\Python\Python39\site-packages\layoutparser\visualization.py", line 194, in wrap
    out = func(canvas, layout, *args, **kwargs)
  File "C:\Users\adityara\AppData\Roaming\Python\Python39\site-packages\layoutparser\visualization.py", line 479, in draw_text
    modified_box = ele.pad(right=text_box_width, bottom=text_box_width)
AttributeError: 'str' object has no attribute 'pad'
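The traceback shows that draw_text iterates over the layout argument and calls pad() on each element, and that one element was a plain str. The sketch below reproduces that failure mode in pure Python, without layoutparser. It assumes the raw response returned with return_response=True is a dict with string keys; that shape is an assumption, not something the traceback confirms. FakeBlock and pad_all are hypothetical stand-ins, not part of the library.

```python
class FakeBlock:
    """Hypothetical stand-in for a layout element that supports pad()."""

    def __init__(self, text):
        self.text = text

    def pad(self, right=0, bottom=0):
        # A real element would return an enlarged box; here we just return self.
        return self


def pad_all(layout, width=1):
    """Mimic the loop from the traceback: call pad() on every element."""
    return [ele.pad(right=width, bottom=width) for ele in layout]


# Assumed shape of the raw OCR response: a dict keyed by strings.
raw_response = {"text": "hello world", "data": None}

# Iterating over a dict yields its keys, which are str objects without pad().
try:
    pad_all(raw_response)
except AttributeError as exc:
    error_message = str(exc)

# A sequence of block-like objects passes through the same loop without error.
padded = pad_all([FakeBlock("hello"), FakeBlock("world")])
```

If the assumption holds, the value passed to lp.draw_text needs to be a collection of text blocks rather than the raw response.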

lolipopshock commented 2 years ago

Thank you, @rathorology! I think this is caused by a combination of missing documentation and OCR API design issues. I am working on this and will let you know if there are any updates.
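A possible workaround, sketched under assumptions rather than confirmed in this thread: convert the raw response into a layout of text blocks before drawing. The helper name ocr_layout_for_drawing is hypothetical. It assumes the agent exposes detect(image, return_response=True) and gather_data(response, agg_level), as lp.TesseractAgent appears to, and that lp.TesseractFeatureType.WORD is a valid aggregation level. Check both against the installed layoutparser version.

```python
def ocr_layout_for_drawing(ocr_agent, image, agg_level):
    """Turn a raw OCR response into a layout of text blocks.

    Assumes ocr_agent provides detect(image, return_response=True) and
    gather_data(response, agg_level); verify this for your version.
    """
    response = ocr_agent.detect(image, return_response=True)
    return ocr_agent.gather_data(response, agg_level)


# Intended usage (not executed here; needs layoutparser and Tesseract installed):
#   import layoutparser as lp
#   ocr_agent = lp.TesseractAgent()
#   layout = ocr_layout_for_drawing(ocr_agent, img, lp.TesseractFeatureType.WORD)
#   lp.draw_text(img, layout, font_size=12, with_box_on_text=True, text_box_width=1)
```

Keeping the agent as a parameter means the helper has no hard dependency on layoutparser, so it can be exercised with a stub before running real OCR.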