RapidAI / RapidOCRPDF

Extracts PDF content using RapidOCR.
Apache License 2.0

Can't pdfextractor force OCR on every page of a PDF? #3

Closed: yonglee7015 closed this 11 months ago

yonglee7015 commented 1 year ago

Some of my PDF pages mix text with scanned content or images. Is there a parameter that forces OCR on every page, similar to ocrmypdf's force_ocr?

SWHL commented 1 year ago

I'll consider adding this later. PRs are welcome.

SWHL commented 11 months ago

This is implemented in rapidocr_pdf>=0.0.7.
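
For readers unfamiliar with the idea, here is a minimal sketch of what a force-OCR switch usually means for mixed pages. It is not the rapidocr_pdf implementation, and this thread does not state the library's actual parameter name. `Page`, `extract_pages`, and the `ocr` callable are hypothetical names made up for illustration, and the `force_ocr` flag name is borrowed from the ocrmypdf option mentioned in the question.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for one PDF page (not a rapidocr_pdf type)."""
    number: int
    embedded_text: str  # text layer read directly from the PDF; empty for scans
    image: bytes        # rendered page image that an OCR engine would read


def extract_pages(
    pages: List[Page],
    ocr: Callable[[bytes], str],
    force_ocr: bool = False,
) -> List[str]:
    """Return one text string per page.

    Without force_ocr, a page that already has a text layer is trusted as-is,
    so any text that exists only inside images on that page is missed.
    With force_ocr, every page is sent through the OCR callable.
    """
    results = []
    for page in pages:
        if force_ocr or not page.embedded_text.strip():
            results.append(ocr(page.image))
        else:
            results.append(page.embedded_text)
    return results


def fake_ocr(image: bytes) -> str:
    """Placeholder OCR engine used only to make the sketch runnable."""
    return "ocr:" + image.decode()


pages = [Page(1, "typed text", b"img1"), Page(2, "", b"img2")]
default_result = extract_pages(pages, fake_ocr)
forced_result = extract_pages(pages, fake_ocr, force_ocr=True)
```

In this sketch the default mode only runs OCR on page 2, which has no text layer, while the forced mode runs it on both pages. That is the behavior the question asks for when a page mixes typed text with scanned regions.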