xavctn / img2table

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing
MIT License

Rotated PDF does not work #206

Open Juan-hwt opened 4 months ago

Juan-hwt commented 4 months ago

i-2_清新农商行.pdf

from img2table.document import PDF
from img2table.ocr import PaddleOCR
src = 'i-2_清新农商行.pdf'
dest = 'i-2_清新农商行.xlsx'

doc = PDF(
    src,
    detect_rotation=True,
    pdf_text_extraction=True
)

ocr = PaddleOCR(lang="ch")

doc.to_xlsx(dest=dest,
            ocr=ocr,
            implicit_rows=False,
            borderless_tables=True,
            min_confidence=50)

I have set detect_rotation=True, but it does not work: the rotated pages are not rotated automatically.

How can I fix it?