Closed by jotpunktopunkt 1 month ago
Can you reproduce it on other public instances like https://stirlingpdf.io/ or https://pdf.adminforge.de/, and does it happen for all PDFs? Are you able to share the file?
And which settings are you running for the OCR?
Those are the settings I'm using. I can't share the PDF file in question, but I will make a test PDF and see if I can reproduce it, so that I can share it. I'll report back ASAP.
And in case it is relevant: it's a Proxmox VM running DietPi with 2 cores. Docker and the Portainer agent are installed, and Docker Compose is used with the config above. I access Stirling-PDF via Nginx Proxy Manager.
Interestingly, I get working documents when I use the "correct skewed angle" option.
In case it is relevant: all PDFs I have run OCR on so far are scans of paper documents, so yes, they could have been slightly skewed. But I don't know why that would trigger an exit code 4 and an invalid file.
Edit: would it help if I sent you one of the files? Is there a way to PM it or something similar?
You can send it to me on Discord, but no other way. What happens if you choose Forced as the OCR mode? Does that also work?
Force OCR does work, thank you! But why is that? What's the difference, and what mistake on my side does exit code 4 hint at? Should I prep files differently?
I'm not actually sure what the cause is. Normally it's font issues, and Forced mode converts each page into an image before running OCR.
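To make the difference between the modes a bit more concrete, here is a minimal sketch of how the equivalent OCRmyPDF command lines could be assembled outside of Stirling-PDF. The container log quoted in this issue shows that OCRmyPDF is what runs underneath, but how Stirling-PDF maps its UI options to flags is an assumption here. `build_ocr_command` and the file names are hypothetical; `-l`, `--force-ocr`, `--skip-text` and `--deskew` are OCRmyPDF command-line options.

```python
from typing import List, Optional


def build_ocr_command(src: str, dst: str, lang: str = "deu",
                      mode: Optional[str] = None,
                      deskew: bool = False) -> List[str]:
    """Assemble an ocrmypdf argument list.

    Hypothetical helper for illustration only, not Stirling-PDF code.
    """
    cmd = ["ocrmypdf", "-l", lang]
    if mode == "force":
        # Rasterize every page to an image, then OCR it. Existing fonts
        # and text objects are discarded along the way.
        cmd.append("--force-ocr")
    elif mode == "skip":
        # Leave pages that already contain text untouched.
        cmd.append("--skip-text")
    elif mode is not None:
        raise ValueError(f"unknown mode: {mode!r}")
    if deskew:
        # Straighten slightly rotated scans before OCR.
        cmd.append("--deskew")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # without ocrmypdf being installed.
    print(" ".join(build_ocr_command("scan.pdf", "scan_ocr.pdf", mode="force")))
```

Running the printed command by hand on a failing file (for example inside the container) and comparing it with the same call without `--force-ocr` may help narrow down whether the original page content is what breaks the output.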
Installation Method
Docker
The Problem
Hi,
thanks for putting so much great effort into Stirling-PDF, it's great!
I have the default eng.traineddata and the 15 MB deu.traineddata installed.
Whenever I try to run OCR on a PDF, no matter the language (but most often I try the German package), I get an error stating exit code 4.
The last line of the container log (Portainer) states: WARNING ocrmypdf._pipelines._common - Output file: The generated PDF is INVALID
I have deleted every folder and re-initialized the container, but it made no difference. I don't know where to keep looking to fix this, and I don't know exactly which logs to provide.
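As a starting point for reading the error: the exit code itself narrows things down. Below is a minimal lookup sketch, assuming the exit-code table from the OCRmyPDF documentation (reproduced from memory here, so it should be checked against the installed version). `describe_exit_code` is a hypothetical helper name.

```python
# Exit codes as documented by OCRmyPDF (assumption: verify against the
# documentation for the version bundled in the container).
OCRMYPDF_EXIT_CODES = {
    0: "ok",
    1: "bad arguments",
    2: "input file problem",
    3: "missing dependency",
    4: "invalid output PDF",
    5: "file access error",
    6: "page already has text (prior OCR found)",
    7: "child process error",
    8: "encrypted PDF",
    9: "invalid configuration",
    10: "PDF/A conversion failed",
    15: "other error",
    130: "interrupted (Ctrl+C)",
}


def describe_exit_code(code: int) -> str:
    """Return a short description for an ocrmypdf exit code."""
    return OCRMYPDF_EXIT_CODES.get(code, f"unknown exit code {code}")


if __name__ == "__main__":
    print(describe_exit_code(4))
```

Exit code 4 matches the "generated PDF is INVALID" warning in the log: an output file was written, but it failed validation. That points at the content of the input PDF rather than at the installation or the language data.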
Version of Stirling-PDF
0.28.3
Last Working Version of Stirling-PDF
never
Page Where the Problem Occurred
OCR
Docker Configuration
Relevant Log Output
Additional Information
No response
Browsers Affected
Microsoft Edge
No Duplicate of the Issue