Open tiamjiakun opened 3 months ago
waiting
+1 on this. pdfminer struggles with a large number of the documents I'm testing with. pymupdf, on the other hand, opens anything I throw at it flawlessly. Setting ocr=true
will switch to pymupdf, but it also runs additional logic intended for OCR.
This seems to be a pdfminer issue: https://github.com/pdfminer/pdfminer.six/issues/1004 https://github.com/NixOS/nixpkgs/pull/339919
There's a fix now, but you'll have to wait until it gets released, which could be a while.
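Until the fix is released, one possible workaround for plain text extraction is to try pdfminer first and fall back to pymupdf when it raises. The sketch below is not open-parse's own code, and it does not go through open-parse at all. The names `extract_text_with_fallback`, `_pdfminer_text` and `_pymupdf_text` are hypothetical helpers made up for illustration. It assumes both pdfminer.six and PyMuPDF are installed, and that a failing parse surfaces as a Python exception.

```python
from typing import Callable, Optional, Sequence


def _pdfminer_text(path: str) -> str:
    # Imported lazily so this module loads even if pdfminer.six is missing.
    from pdfminer.high_level import extract_text
    return extract_text(path)


def _pymupdf_text(path: str) -> str:
    # PyMuPDF is imported under the name "fitz".
    import fitz
    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_text_with_fallback(
    path: str,
    extractors: Optional[Sequence[Callable[[str], str]]] = None,
) -> str:
    """Hypothetical helper: return text from the first extractor that works."""
    if extractors is None:
        # Assumed order: pdfminer first, pymupdf as the fallback.
        extractors = (_pdfminer_text, _pymupdf_text)
    errors = []
    for extract in extractors:
        try:
            return extract(path)
        except Exception as exc:  # includes ImportError for a missing library
            errors.append(f"{extract.__name__}: {exc!r}")
    raise RuntimeError(f"all extractors failed for {path}: " + "; ".join(errors))
```

Calling `extract_text_with_fallback("example1.pdf")` would then return pdfminer's output when it succeeds and pymupdf's otherwise. Note that this only yields raw text, so it probably won't replace whatever layout information open-parse gets from pdfminer.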
Initial Checks
Description
example1.pdf example2.pdf
Example Code
Python, open-parse & OS Version