deepopinion / ocr_wrapper

A Python wrapper for multiple OCR solutions
MIT License

Multi-response aggregation fails with `ValueError: x not in list` #9

Closed phschoepf closed 5 months ago

phschoepf commented 9 months ago

When using the image below, OCR fails at the aggregation stage because a bbox is not in the list where it is expected.

ocr_wrapper 1.0.0
Python 3.11.7

  File "~/coworker-service/src/coworker/utils/preprocessing.py", line 42, in image_to_page_input
    scan = await asyncio.to_thread(ocr_scanner.ocr, image)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/ocr_wrapper.py", line 98, in ocr
    result = self._get_multi_response(img)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/ocr_wrapper.py", line 137, in _get_multi_response
    response = aggregate_ocr_samples(responses, img.size)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/aggregate_multiple_responses.py", line 170, in aggregate_ocr_samples
    bbox_groups = _group_overlapping_bboxes(bboxes, 0.1)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/coworker-service/.venv/lib/python3.11/site-packages/ocr_wrapper/aggregate_multiple_responses.py", line 66, in _group_overlapping_bboxes
    working_bboxes.remove(overlapping_bbox)
ValueError: list.remove(x): x not in list
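
The last frame shows `list.remove` raising because the element was no longer in `working_bboxes`. One way this can occur (not confirmed as the cause here) is when the same bbox is selected for removal more than once. Below is a minimal sketch of a grouping pass that tracks indices instead of calling `list.remove`, so repeated or identical boxes cannot trigger the error. It assumes boxes are `(x0, y0, x1, y1)` tuples and that overlap is measured by intersection over union; `iou` and `group_overlapping` are hypothetical names, not the actual ocr_wrapper implementation.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x0, y0, x1, y1)
BBox = Tuple[float, float, float, float]


def iou(a: BBox, b: BBox) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def group_overlapping(bboxes: List[BBox], threshold: float) -> List[List[BBox]]:
    """Greedy grouping by overlap with a seed box (hypothetical helper).

    Works on indices, so each box is assigned to exactly one group and
    nothing is ever removed from a list by value.
    """
    remaining = list(range(len(bboxes)))
    groups: List[List[BBox]] = []
    while remaining:
        seed = remaining.pop(0)
        group = [seed]
        keep = []
        for idx in remaining:
            if iou(bboxes[seed], bboxes[idx]) > threshold:
                group.append(idx)
            else:
                keep.append(idx)
        remaining = keep
        groups.append([bboxes[i] for i in group])
    return groups


def double_remove_raises() -> bool:
    """Reproduce the error class from the traceback with a plain list."""
    working = [(0, 0, 10, 10)]
    working.remove((0, 0, 10, 10))
    try:
        working.remove((0, 0, 10, 10))  # second removal of the same value
    except ValueError:
        return True
    return False
```

With the index-based version, an input containing two identical boxes is grouped without raising, whereas removing the same value twice from a plain list reproduces the `ValueError` shown above.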

[Attached image: debug]

Paethon commented 5 months ago

The 1.x branch has been deprecated