Closed DeepKariaX closed 3 months ago
We got the same issue too. Is there any solution?
@vav1lo For now I have switched to another reader. Could you attach the PDF you are testing with? Mine is too confidential to share, and a sample PDF would make it easier for the maintainers to diagnose the error.
Hi @vav1lo, can you please attach the PDF that you are testing?
Same problem with this file: uber_10q_march_2022.pdf
```python
import os

from unstructured.partition.pdf import partition_pdf
from unstructured.staging.base import elements_to_json

filename = "uber_10q_march_2022.pdf"

elements = partition_pdf(
    filename=filename,
    strategy="hi_res",
    infer_table_structure=True,
    model_name="yolox",
)
```
@christinestraub
@christinestraub Here is the PDF that I am testing: 1b4c03d6-f6f5-462d-8bd6-0b9e411bc33d.pdf
I am also getting an error while partitioning a PDF, and it occurs specifically with the argument infer_table_structure=True:

      9 import torch
     10 import transformers
---> 11 from cv2.typing import MatLike
     12 from PIL.Image import Image
     13 from transformers import DonutProcessor, VisionEncoderDecoderModel

ModuleNotFoundError: No module named 'cv2.typing'; 'cv2' is not a package
I think this has to do with the OpenCV installation.
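The "'cv2' is not a package" part of the message suggests that Python resolved `cv2` to a plain module rather than a package directory, in which case `cv2.typing` cannot be imported as a submodule. A minimal diagnostic sketch, using only the standard library, shows where a module name resolves from and whether it is a package. The helper name `describe_module` is hypothetical and only for illustration; possible causes (an OpenCV build that predates `cv2.typing`, or a local file shadowing the real package) are assumptions to check, not a confirmed diagnosis.

```python
import importlib.util


def describe_module(name):
    """Report where `name` resolves from and whether it is a package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # The module cannot be found on the current import path.
        return {"found": False, "origin": None, "is_package": False}
    return {
        "found": True,
        # Path of the file (or a marker such as "built-in") that would be loaded.
        "origin": spec.origin,
        # Packages expose submodule search locations; plain modules do not.
        "is_package": spec.submodule_search_locations is not None,
    }


if __name__ == "__main__":
    # If is_package is False here, `from cv2.typing import ...` cannot work.
    print(describe_module("cv2"))
```

If `origin` points somewhere unexpected, or `is_package` is false, the installed OpenCV distribution is the first thing to look at.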
This started happening to me when I upgraded from 0.12.6 to 0.14.6
I installed it as well, but what actually needs to change is the import itself.
Hi @DeepKariaX, @vav1lo, @hackpointt, @Nidhi2497, @nikklavzar
Addressed in https://github.com/Unstructured-IO/unstructured-inference/pull/359. You'll need to upgrade unstructured-inference to 0.7.36. I tested your code with the provided PDF documents and it worked as expected.
Closing this since it's assumed to be resolved, but feel free to reopen if you're still having this issue.
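To confirm that an environment actually has the fixed release, the installed version can be compared against 0.7.36. The following is a rough sketch using only the standard library; the helper names `release_tuple`, `meets_minimum`, and `installed_version` are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `rc1`.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Extract the leading numeric release segment, e.g. '0.7.36' -> (0, 7, 36)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum="0.7.36"):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    return release_tuple(installed) >= release_tuple(minimum)


def installed_version(package="unstructured-inference"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("unstructured-inference is not installed")
    else:
        print(found, "OK" if meets_minimum(found) else "older than 0.7.36")
```

If the check reports an older version, `pip install --upgrade unstructured-inference` in the same environment should pull in a newer release.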
@christinestraub This is resolved, thanks!
Describe the bug
Calling partition_pdf raises ValueError: max() arg is an empty sequence. When I keep the parameter infer_table_structure=True I get this error; after removing the parameter it works perfectly.
File which received bug
unstructured_inference/models/tables.py, line 667, in fill_cells
table_rows_no = max({row for cell in cells for row in cell["row_nums"]})
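The failing expression builds a set of row indices and passes it to `max()`. When no cells are detected the set is empty, and `max()` raises `ValueError`. The sketch below reproduces that with an empty list and shows one possible guard using the built-in `default` argument. The helper `max_row_index` and the sentinel value `-1` are assumptions for illustration only; this is not the change made in the library.

```python
def max_row_index(cells, default=-1):
    """Return the largest row index across `cells`, or `default` if there are none."""
    row_indices = {row for cell in cells for row in cell["row_nums"]}
    # max() raises ValueError on an empty iterable unless `default` is supplied.
    return max(row_indices, default=default)


if __name__ == "__main__":
    cells = []  # what the expression sees when no table cells were detected

    try:
        max({row for cell in cells for row in cell["row_nums"]})
    except ValueError as exc:
        print("unguarded:", exc)

    print("guarded:", max_row_index(cells))
```

A caller can then treat the sentinel as "no table rows" instead of crashing.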
Expected behavior
Even with infer_table_structure=True, partition_pdf should partition the PDF without errors. (Maybe add error handling for when a None value is received.)
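Until a fixed release is installed, a caller-side workaround is to retry without table inference when the `ValueError` occurs, since the report above says partitioning works once the parameter is removed. This is only a sketch: the wrapper name `partition_with_table_fallback` is hypothetical, it catches any `ValueError` (which may hide unrelated problems), and the fallback result carries no table structure.

```python
def partition_with_table_fallback(partition_fn, **kwargs):
    """Call `partition_fn` with table inference, retrying without it on ValueError."""
    try:
        return partition_fn(infer_table_structure=True, **kwargs)
    except ValueError:
        # Assumption: the failure comes from table-structure inference,
        # so the same call without it is expected to succeed.
        return partition_fn(infer_table_structure=False, **kwargs)


# Intended usage with the real function (requires unstructured to be installed):
#
#   from unstructured.partition.pdf import partition_pdf
#   elements = partition_with_table_fallback(
#       partition_pdf, filename="uber_10q_march_2022.pdf", strategy="hi_res"
#   )
```

Passing the partition function as an argument keeps the wrapper independent of the library, so it can be tried out with a stand-in function first.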