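When using the image below, OCR fails at the aggregation stage: _group_overlapping_bboxes calls working_bboxes.remove(overlapping_bbox) for a bbox that is no longer (or never was) in the list, which raises a ValueError.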
ocr_wrapper 1.0.0
Python 3.11.7
  File "~/coworker-service/src/coworker/utils/preprocessing.py", line 42, in image_to_page_input
    scan = await asyncio.to_thread(ocr_scanner.ocr, image)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/ocr_wrapper.py", line 98, in ocr
    result = self._get_multi_response(img)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/ocr_wrapper.py", line 137, in _get_multi_response
    response = aggregate_ocr_samples(responses, img.size)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/aggregate_multiple_responses.py", line 170, in aggregate_ocr_samples
    bbox_groups = _group_overlapping_bboxes(bboxes, 0.1)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/aggregate_multiple_responses.py", line 66, in _group_overlapping_bboxes
    working_bboxes.remove(overlapping_bbox)
ValueError: list.remove(x): x not in list
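
The error is consistent with the same bbox being removed from working_bboxes more than once, for example when one box overlaps several others. I have not confirmed that this is what actually happens inside _group_overlapping_bboxes. The sketch below is hypothetical and is not the library's code: the function names `group_overlapping` and `iou`, the greedy grouping strategy, and the `(x0, y0, x1, y1)` tuple format are all assumptions. It only illustrates how tracking the remaining boxes by index avoids calling `list.remove` on a box that has already been consumed.

```python
from typing import List, Tuple

# Assumed bbox format: (x0, y0, x1, y1). The real ocr_wrapper format may differ.
BBox = Tuple[float, float, float, float]


def iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_overlapping(bboxes: List[BBox], threshold: float) -> List[List[BBox]]:
    """Greedily group boxes whose IoU with a seed box exceeds `threshold`.

    Remaining boxes are tracked by index in a set, so a box is never
    removed twice and duplicate coordinates are handled safely.
    """
    remaining = set(range(len(bboxes)))
    groups: List[List[BBox]] = []
    for i in range(len(bboxes)):
        if i not in remaining:
            continue  # already assigned to an earlier group
        remaining.discard(i)
        group = [bboxes[i]]
        # sorted() copies the indices, so discarding inside the loop is safe.
        for j in sorted(remaining):
            if iou(bboxes[i], bboxes[j]) > threshold:
                group.append(bboxes[j])
                remaining.discard(j)  # discard never raises, unlike list.remove
        groups.append(group)
    return groups
```

With this bookkeeping, duplicate boxes or a box that overlaps several seeds end up in exactly one group instead of triggering a second removal.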