deepdoctection / deepdoctection

A Repo For Document AI
Apache License 2.0

torch.max_pool2d #338

Open RK9534 opened 1 month ago

RK9534 commented 1 month ago

I am attempting to implement the provided example on my PDF file but have encountered an error. I have installed all the dependencies specified in the setup.py file. Below is the code I am using:

import deepdoctection as dd

path = "/path/to/dir/sample/2312.13560.pdf"

analyzer = dd.get_dd_analyzer(
    config_overwrite=[
        "PT.LAYOUT.WEIGHTS=microsoft/table-transformer-detection/pytorch_model.bin",
        "PT.ITEM.WEIGHTS=microsoft/table-transformer-structure-recognition/pytorch_model.bin",
        "PT.ITEM.FILTER=['table']",
        "OCR.USE_DOCTR=True",
        "OCR.USE_TESSERACT=False",
        "TEXT_ORDERING.INCLUDE_RESIDUAL_TEXT_CONTAINER=True",
    ]
)

analyzer.pipe_component_list[0].predictor.config.threshold = 0.4
df = analyzer.analyze(path=path)
df.reset_state()

Up to this point everything works. The error occurs when I then run dp = next(iter(df)):

[0517 09:11.58 @doctectionpipe.py:84] INF Processing 2312.13560.pdf
[0517 09:12.02 @context.py:126] INF ImageLayoutService total: 1.9849 sec.
[0517 09:12.03 @context.py:126] INF SubImageLayoutService total: 1.5151 sec.
[0517 09:12.03 @context.py:126] INF PubtablesSegmentationService total: 0.0409 sec.
[0517 09:12.09 @context.py:126] INF ImageLayoutService total: 5.6937 sec.

RuntimeError                              Traceback (most recent call last)
in <cell line: 1>()
----> 1 dp = next(iter(df))
      2 np_image = dp.viz()

23 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py in _max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode, return_indices)
    794     if stride is None:
    795         stride = torch.jit.annotate(List[int], [])
--> 796     return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
    797
    798

RuntimeError: Given input size: (128x1x16). Calculated output size: (128x0x8). Output size is too small

I'm using Google Colab as an environment.
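For reference, the shapes in the RuntimeError above are consistent with the standard max-pooling output-size formula. The sketch below is plain Python and does not call into torch. The kernel size and stride of 2 are assumptions, since the traceback does not show them, and `pool_output_size` is a hypothetical helper name.

```python
def pool_output_size(size, kernel_size, stride, padding=0, dilation=1):
    """Output length along one spatial dimension of a max-pooling layer.

    Uses the floor-mode formula given in the PyTorch MaxPool2d documentation:
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    """
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Spatial input size from the error message: height 1, width 16.
# kernel_size=2 and stride=2 are assumed values, not taken from the traceback.
out_h = pool_output_size(1, kernel_size=2, stride=2)
out_w = pool_output_size(16, kernel_size=2, stride=2)

print(f"pooled size: {out_h} x {out_w}")
if out_h < 1 or out_w < 1:
    print("A dimension collapsed to zero, so the input is too small to pool.")
```

With these assumed parameters, a feature map that is only 1 pixel high leaves no room for a 2-pixel pooling window, which would explain a zero in the calculated output size.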

JaMe76 commented 1 month ago

It looks like there is a problem with the image input size. Does the image have three channels, and is it reasonably large (e.g. at least 600 px)?
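A quick way to check both points is to look at the shape of the page image before it goes through the pipeline. The following is only a sketch: `check_image_shape` is a hypothetical helper and not part of deepdoctection, it assumes the shape is given as (height, width, channels), for example the `.shape` of a NumPy array, and the 600 px threshold is just the rough figure mentioned above.

```python
def check_image_shape(shape, min_side=600, channels=3):
    """Return a list of human-readable problems for an image shape (H, W, C).

    An empty list means the shape passed both checks.
    """
    problems = []
    if len(shape) != 3:
        # Grayscale arrays are often 2-D and have no channel axis at all.
        problems.append(f"expected 3 dimensions (H, W, C), got {len(shape)}")
        return problems

    height, width, num_channels = shape
    if num_channels != channels:
        problems.append(f"expected {channels} channels, got {num_channels}")
    if min(height, width) < min_side:
        problems.append(
            f"shortest side is {min(height, width)} px, below {min_side} px"
        )
    return problems


# Example shapes (made up for illustration only).
for shape in [(800, 600, 3), (32, 400, 1), (800, 600)]:
    issues = check_image_shape(shape)
    print(shape, "OK" if not issues else issues)
```

If a check like this flags a page or a cropped region, rendering the PDF at a higher resolution or converting the image to three channels would be the first things to try.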

JaMe76 commented 2 weeks ago

Duplicate of #345