OpenPecha / Toolkit

🛠 Tools to create, edit and export texts and annotations
https://toolkit.openpecha.org
Apache License 2.0

Crash in create_opf from Google Vision #274

Open eroux opened 4 weeks ago

eroux commented 4 weeks ago

See this stacktrace:

  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 722, in create_opf
    base_text, layers, word_confidence_list = self.build_base(image_group_id)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 594, in build_base
    self.build_page(bboxes, image_number+1, image_filename, state, avg_char_width)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 488, in build_page
    sorted_bboxes = self.sort_bboxes(flatten_bboxes)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 300, in sort_bboxes
    avg_box_height = self.get_avg_bbox_height(main_region_bboxes)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 156, in get_avg_bbox_height
    avg_height = height_sum / bboxeswidth
ZeroDivisionError: division by zero
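
The trace above ends in a division whose divisor is zero, which is what one would expect if a page yields no bounding boxes in its main region. The following is a minimal sketch of a guarded average, not the actual openpecha code: the function name `average_bbox_height` and its list-of-heights input are assumptions made for illustration.

```python
from typing import Optional, Sequence


def average_bbox_height(heights: Sequence[float]) -> Optional[float]:
    """Return the mean of the given heights, or None if there are none.

    Hypothetical helper: it takes plain numbers rather than the
    bounding-box objects used inside openpecha.
    """
    if not heights:
        # An empty page (or empty main region) has no meaningful average.
        return None
    return sum(heights) / len(heights)
```

A caller would then have to treat `None` as "nothing to sort on this page" and skip the page rather than continue with an undefined height.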

eroux commented 4 weeks ago

With yesterday's fix, here's the new error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 727, in create_opf
    base_text, layers, word_confidence_list = self.build_base(image_group_id)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 599, in build_base
    self.build_page(bboxes, image_number+1, image_filename, state, avg_char_width)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 493, in build_page
    sorted_bboxes = self.sort_bboxes(flatten_bboxes)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 310, in sort_bboxes
    sorted_bbox_centriods = self.get_bbox_sorted_on_x(sort_on_y_bboxs, avg_box_height, bboxes)
  File "/usr/local/lib/python3.9/site-packages/openpecha/formatters/ocr/ocr.py", line 271, in get_bbox_sorted_on_x
    prev_bbox = bboxes_sorted_on_y[0]
IndexError: list index out of range
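
The `IndexError` comes from reading element `[0]` of a list that is empty, so the same empty-page input seems to reach the next step once the division is guarded. Below is a hedged sketch of a line-grouping routine that returns early on empty input. It is not the openpecha implementation: `group_into_lines`, the `(x, y)` centroid tuples, and the `line_tolerance` parameter are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: each box is reduced to its (x, y) centroid.
Centroid = Tuple[float, float]


def group_into_lines(
    centroids_sorted_on_y: List[Centroid], line_tolerance: float
) -> List[List[Centroid]]:
    """Group y-sorted centroids into lines, each line sorted on x."""
    if not centroids_sorted_on_y:
        # Nothing to group: avoid indexing [0] on an empty list.
        return []

    lines: List[List[Centroid]] = [[centroids_sorted_on_y[0]]]
    prev = centroids_sorted_on_y[0]
    for centroid in centroids_sorted_on_y[1:]:
        if abs(centroid[1] - prev[1]) <= line_tolerance:
            # Close enough vertically: same line as the previous box.
            lines[-1].append(centroid)
        else:
            lines.append([centroid])
        prev = centroid

    return [sorted(line, key=lambda c: c[0]) for line in lines]
```

For example, `group_into_lines([], 5.0)` returns an empty list instead of raising, so a blank page would simply contribute no text.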