felixdittrich92 / OnnxTR

OnnxTR: a docTR (Document Text Recognition) library ONNX pipeline wrapper for seamless, high-performing & accessible OCR
https://github.com/mindee/doctr
Apache License 2.0

Unexpected keyword "rec_arch" in models/builder.py #5

Closed: nospotfer closed this issue 2 months ago

nospotfer commented 2 months ago

Bug description

DocumentBuilder does not accept a "rec_arch" argument, but one is passed through to it via kwargs.

Code snippet to reproduce the bug

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor

model = ocr_predictor(
    det_arch='db_resnet50',  # detection architecture
    rec_arch='crnn_mobilenet_v3_large',  # recognition architecture
    det_bs=4, # detection batch size
    reco_bs=1024, # recognition batch size
    assume_straight_pages=True,  # set to `False` if the pages are not straight (rotation, perspective, etc.) (default: True)
    straighten_pages=False,  # set to `True` if the pages should be straightened before final processing (default: False)
    # Preprocessing related parameters
    preserve_aspect_ratio=True,  # set to `False` if the aspect ratio should not be preserved (default: True)
    symmetric_pad=True,  # set to `False` to disable symmetric padding (default: True)
    # Additional parameters - meta information
    detect_orientation=False,  # set to `True` if the orientation of the pages should be detected (default: False)
    detect_language=False, # set to `True` if the language of the pages should be detected (default: False)
    # DocumentBuilder specific parameters
    resolve_lines=True,  # whether words should be automatically grouped into lines (default: True)
    resolve_blocks=True,  # whether lines should be automatically grouped into blocks (default: True)
    paragraph_break=0.035,  # relative length of the minimum space separating paragraphs (default: 0.035)
)

Error traceback

  warnings.warn(
Traceback (most recent call last):
  File "/home/gabriel/git/OnnxTR/demo/app.py", line 6, in <module>
    model = ocr_predictor(
            ^^^^^^^^^^^^^^
  File "/home/gabriel/git/OnnxTR/onnxtr/models/zoo.py", line 103, in ocr_predictor
    return _predictor(
           ^^^^^^^^^^^
  File "/home/gabriel/git/OnnxTR/onnxtr/models/zoo.py", line 43, in _predictor
    return OCRPredictor(
           ^^^^^^^^^^^^^
  File "/home/gabriel/git/OnnxTR/onnxtr/models/predictor/predictor.py", line 57, in __init__
    _OCRPredictor.__init__(
  File "/home/gabriel/git/OnnxTR/onnxtr/models/predictor/base.py", line 50, in __init__
    self.doc_builder = DocumentBuilder(**kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: DocumentBuilder.__init__() got an unexpected keyword argument 'rec_arch'

Environment

OS: Ubuntu 22.04
Python version: 3.11
Library version:
Onnxruntime version: 1.17
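
The mechanism behind the traceback above can be reproduced without OnnxTR installed. The sketch below uses simplified stand-ins: the class and function names mirror the call chain in the traceback, but the bodies are assumptions written for illustration, not the library's real code. Each layer forwards keywords it does not name itself via **kwargs, so a misspelled keyword travels all the way down to the first callable with a fixed signature, which rejects it.

```python
from typing import Any


class DocumentBuilder:
    """Stand-in with a fixed signature (no **kwargs), like the class in the traceback."""

    def __init__(
        self,
        resolve_lines: bool = True,
        resolve_blocks: bool = True,
        paragraph_break: float = 0.035,
    ) -> None:
        self.resolve_lines = resolve_lines
        self.resolve_blocks = resolve_blocks
        self.paragraph_break = paragraph_break


def ocr_predictor(
    det_arch: str = "fast_base",
    reco_arch: str = "crnn_vgg16_bn",
    **kwargs: Any,
) -> DocumentBuilder:
    """Stand-in factory: any keyword it does not name itself lands in **kwargs."""
    # A misspelled keyword such as rec_arch is not bound to reco_arch;
    # it is forwarded untouched to the next layer.
    return DocumentBuilder(**kwargs)


def try_build(**kwargs: Any) -> str:
    """Return 'ok' on success, otherwise the TypeError message."""
    try:
        ocr_predictor(**kwargs)
    except TypeError as err:
        return str(err)
    return "ok"


# Correct spelling: consumed by ocr_predictor, never reaches DocumentBuilder
print(try_build(reco_arch="crnn_mobilenet_v3_large", resolve_lines=True))
# Misspelling: forwarded through **kwargs and rejected by DocumentBuilder
print(try_build(rec_arch="crnn_mobilenet_v3_large", resolve_lines=True))
```

This is why the error is reported in DocumentBuilder even though the misspelled argument was meant for ocr_predictor.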

nospotfer commented 2 months ago

I managed to work around it by simply adding **kwargs to DocumentBuilder's __init__() like this:

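from typing import Any

def __init__(
        self,
        resolve_lines: bool = True,
        resolve_blocks: bool = True,
        paragraph_break: float = 0.035,
        export_as_straight_boxes: bool = False,
        **kwargs: Any,  # workaround: accept (and ignore) extra keyword arguments such as rec_arch
) -> None:
        ...  # rest of the method body unchanged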

But perhaps there's a more elegant solution.
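
One caveat with swallowing **kwargs is that a misspelled keyword is then silently ignored, so the affected option would quietly fall back to its default. A possible alternative, sketched below, is to validate keywords before forwarding them and to suggest the closest known name. This is only an illustration, not something OnnxTR does: `validate_kwargs`, `build_document_builder` and `UPSTREAM_NAMES` are hypothetical names, and the parameter lists are copied from the signatures quoted in this thread.

```python
import difflib
import inspect
from typing import Any, Callable, Sequence


def build_document_builder(
    resolve_lines: bool = True,
    resolve_blocks: bool = True,
    paragraph_break: float = 0.035,
    export_as_straight_boxes: bool = False,
) -> dict[str, Any]:
    """Hypothetical stand-in for DocumentBuilder.__init__ (fixed signature)."""
    return {
        "resolve_lines": resolve_lines,
        "resolve_blocks": resolve_blocks,
        "paragraph_break": paragraph_break,
        "export_as_straight_boxes": export_as_straight_boxes,
    }


# Keywords consumed by earlier layers (names as they appear in this thread)
UPSTREAM_NAMES = ("det_arch", "reco_arch", "det_bs", "reco_bs")


def validate_kwargs(
    target: Callable[..., Any],
    kwargs: dict[str, Any],
    extra_names: Sequence[str] = UPSTREAM_NAMES,
) -> None:
    """Raise a TypeError with a spelling hint if kwargs has names target cannot take."""
    accepted = [
        name
        for name, param in inspect.signature(target).parameters.items()
        if param.kind is not inspect.Parameter.VAR_KEYWORD
    ]
    unknown = [name for name in kwargs if name not in accepted]
    if not unknown:
        return
    # Also look at upstream names, since the typo may belong to an earlier layer
    candidates = accepted + list(extra_names)
    hints = []
    for name in unknown:
        close = difflib.get_close_matches(name, candidates, n=1)
        hints.append(f"{name!r} (did you mean {close[0]!r}?)" if close else repr(name))
    raise TypeError("unexpected keyword argument(s): " + ", ".join(hints))


try:
    validate_kwargs(build_document_builder, {"rec_arch": "crnn_mobilenet_v3_large"})
except TypeError as err:
    print(err)
```

The check keeps the strict behaviour (unknown keywords still fail) while making the error message point at the likely intended name.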

felixdittrich92 commented 2 months ago

Hey @nospotfer :wave:,

Thanks for reporting! Could you please add a full working code snippet to reproduce the error? :)

felixdittrich92 commented 2 months ago

@nospotfer sorry, it's a typo in the README xD

It should be reco_arch instead of rec_arch

felixdittrich92 commented 2 months ago

updated :+1:

nospotfer commented 2 months ago

Great! In any case, here's the snippet I used:

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor

doc = DocumentFile.from_pdf("/path/to/pdf_file.pdf")

model = ocr_predictor(
    det_arch='db_resnet50',  # detection architecture
    rec_arch='crnn_mobilenet_v3_large',  # recognition architecture
    det_bs=4,  # detection batch size
    reco_bs=1024,  # recognition batch size
    assume_straight_pages=True,  # set to `False` if the pages are not straight (rotation, perspective, etc.) (default: True)
    straighten_pages=False,  # set to `True` if the pages should be straightened before final processing (default: False)
    preserve_aspect_ratio=True,  # set to `False` if the aspect ratio should not be preserved (default: True)
    symmetric_pad=True,  # set to `False` to disable symmetric padding (default: True)
    # DocumentBuilder specific parameters
    resolve_lines=True,  # whether words should be automatically grouped into lines (default: True)
    resolve_blocks=True,  # whether lines should be automatically grouped into blocks (default: True)
    paragraph_break=0.035,  # relative length of the minimum space separating paragraphs (default: 0.035)
)

ocr_response = model(doc)
json_response = ocr_response.export()
text_response = ocr_response.render()

Just in case it can be useful for someone else
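
For anyone post-processing the result of the export() call in the snippet above, here is a small sketch that collects the recognized words from the exported dictionary. It assumes the nested pages / blocks / lines / words layout with a "value" key per word, as used by docTR's export format; the helper name `words_from_export` and the sample dictionary are made up for illustration, and a real export carries more fields (such as geometry and confidence).

```python
from typing import Any


def words_from_export(export: dict[str, Any]) -> list[str]:
    """Collect word strings from a nested pages/blocks/lines/words dictionary."""
    words: list[str] = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                for word in line.get("words", []):
                    words.append(word["value"])
    return words


# Hand-written sample in the assumed layout, not real OCR output
sample = {
    "pages": [
        {"blocks": [{"lines": [{"words": [{"value": "Hello"}, {"value": "world"}]}]}]}
    ]
}
print(words_from_export(sample))
```

With a real result, the same helper would be called as words_from_export(json_response).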

felixdittrich92 commented 2 months ago

Also added a small first benchmark at the bottom of the README :)

nospotfer commented 2 months ago

The issue is still present with 0.1.2.

Steps to reproduce:

from onnxtr.io import DocumentFile
from onnxtr.models import ocr_predictor

model = ocr_predictor(
    det_arch='db_resnet50',  # detection architecture
    rec_arch='crnn_mobilenet_v3_large',  # recognition architecture
    det_bs=4, # detection batch size
    reco_bs=1024, # recognition batch size
    assume_straight_pages=True,  # set to `False` if the pages are not straight (rotation, perspective, etc.) (default: True)
    straighten_pages=False,  # set to `True` if the pages should be straightened before final processing (default: False)
    # Preprocessing related parameters
    preserve_aspect_ratio=True,  # set to `False` if the aspect ratio should not be preserved (default: True)
    symmetric_pad=True,  # set to `False` to disable symmetric padding (default: True)
    # Additional parameters - meta information
    detect_orientation=False,  # set to `True` if the orientation of the pages should be detected (default: False)
    detect_language=False, # set to `True` if the language of the pages should be detected (default: False)
    # DocumentBuilder specific parameters
    resolve_lines=True,  # whether words should be automatically grouped into lines (default: True)
    resolve_blocks=True,  # whether lines should be automatically grouped into blocks (default: True)
    paragraph_break=0.035,  # relative length of the minimum space separating paragraphs (default: 0.035)
)

Traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 4
      1 from onnxtr.io import DocumentFile
      2 from onnxtr.models import ocr_predictor
----> 4 model = ocr_predictor(
      5     det_arch='db_resnet50',  # detection architecture
      6     rec_arch='crnn_mobilenet_v3_large',  # recognition architecture
      7     det_bs=4, # detection batch size
      8     reco_bs=1024, # recognition batch size
      9     assume_straight_pages=True,  # set to `False` if the pages are not straight (rotation, perspective, etc.) (default: True)
     10     straighten_pages=False,  # set to `True` if the pages should be straightened before final processing (default: False)
     11     # Preprocessing related parameters
     12     preserve_aspect_ratio=True,  # set to `False` if the aspect ratio should not be preserved (default: True)
     13     symmetric_pad=True,  # set to `False` to disable symmetric padding (default: True)
     14     # Additional parameters - meta information
     15     detect_orientation=False,  # set to `True` if the orientation of the pages should be detected (default: False)
     16     detect_language=False, # set to `True` if the language of the pages should be detected (default: False)
     17     # DocumentBuilder specific parameters
     18     resolve_lines=True,  # whether words should be automatically grouped into lines (default: True)
     19     resolve_blocks=True,  # whether lines should be automatically grouped into blocks (default: True)
     20     paragraph_break=0.035,  # relative length of the minimum space separating paragraphs (default: 0.035)
     21 )

File ~/anaconda3/envs/unsloth/lib/python3.10/site-packages/onnxtr/models/zoo.py:103, in ocr_predictor(det_arch, reco_arch, assume_straight_pages, preserve_aspect_ratio, symmetric_pad, export_as_straight_boxes, detect_orientation, straighten_pages, detect_language, **kwargs)
     56 def ocr_predictor(
     57     det_arch: Any = "fast_base",
     58     reco_arch: Any = "crnn_vgg16_bn",
   (...)
     66     **kwargs: Any,
     67 ) -> OCRPredictor:
     68     """End-to-end OCR architecture using one model for localization, and another for text recognition.
     69 
     70     >>> import numpy as np
   (...)
    101         OCR predictor
    102     """
--> 103     return _predictor(
    104         det_arch,
    105         reco_arch,
    106         assume_straight_pages=assume_straight_pages,
    107         preserve_aspect_ratio=preserve_aspect_ratio,
    108         symmetric_pad=symmetric_pad,
    109         export_as_straight_boxes=export_as_straight_boxes,
    110         detect_orientation=detect_orientation,
    111         straighten_pages=straighten_pages,
    112         detect_language=detect_language,
    113         **kwargs,
    114     )

File ~/anaconda3/envs/unsloth/lib/python3.10/site-packages/onnxtr/models/zoo.py:43, in _predictor(det_arch, reco_arch, assume_straight_pages, preserve_aspect_ratio, symmetric_pad, det_bs, reco_bs, detect_orientation, straighten_pages, detect_language, **kwargs)
     37 # Recognition
     38 reco_predictor = recognition_predictor(
     39     reco_arch,
     40     batch_size=reco_bs,
     41 )
---> 43 return OCRPredictor(
     44     det_predictor,
     45     reco_predictor,
     46     assume_straight_pages=assume_straight_pages,
     47     preserve_aspect_ratio=preserve_aspect_ratio,
     48     symmetric_pad=symmetric_pad,
     49     detect_orientation=detect_orientation,
     50     straighten_pages=straighten_pages,
     51     detect_language=detect_language,
     52     **kwargs,
     53 )

File ~/anaconda3/envs/unsloth/lib/python3.10/site-packages/onnxtr/models/predictor/predictor.py:57, in OCRPredictor.__init__(self, det_predictor, reco_predictor, assume_straight_pages, straighten_pages, preserve_aspect_ratio, symmetric_pad, detect_orientation, detect_language, **kwargs)
     55 self.det_predictor = det_predictor
     56 self.reco_predictor = reco_predictor
---> 57 _OCRPredictor.__init__(
     58     self, assume_straight_pages, straighten_pages, preserve_aspect_ratio, symmetric_pad, **kwargs
     59 )
     60 self.detect_orientation = detect_orientation
     61 self.detect_language = detect_language

File ~/anaconda3/envs/unsloth/lib/python3.10/site-packages/onnxtr/models/predictor/base.py:50, in _OCRPredictor.__init__(self, assume_straight_pages, straighten_pages, preserve_aspect_ratio, symmetric_pad, **kwargs)
     48 self.straighten_pages = straighten_pages
     49 self.crop_orientation_predictor = None if assume_straight_pages else crop_orientation_predictor()
---> 50 self.doc_builder = DocumentBuilder(**kwargs)
     51 self.preserve_aspect_ratio = preserve_aspect_ratio
     52 self.symmetric_pad = symmetric_pad

TypeError: DocumentBuilder.__init__() got an unexpected keyword argument 'rec_arch'

OnnxTR version: 0.1.2

felixdittrich92 commented 2 months ago

@nospotfer It's reco_arch, not rec_arch :) Take a look, I have updated the README :)

nospotfer commented 2 months ago

You're completely right! It works like a charm. Thank you!

felixdittrich92 commented 2 months ago

I already plan to experiment a bit next week with the optimizations available in onnxruntime and with 8-bit quantization to further reduce inference latency on CPU :)

nospotfer commented 2 months ago

I'll be very happy to give it a try! In any case, docTR on CPU with ONNX, even without quantization, is already blazing fast.

On Fri, 10 May 2024, 18:18, Felix Dittrich @.***> wrote:

I have already planned to experiment next week a bit with onnxruntime available optimization and 8-Bit quantization to further boost the inference latency on CPU :)


felixdittrich92 commented 2 months ago

@nospotfer 8-bit available now: https://github.com/felixdittrich92/OnnxTR/releases/tag/v0.2.0 Feel free to test it 🤗

nospotfer commented 2 months ago

I'll give it a try as soon as I can!