-
**Describe the bug**
As the title says, `--redo-ocr` doesn't remove the previous OCR layer made by ocrmypdf.
**To Reproduce**
Steps to reproduce the behavior.
```
ocrmypdf "in.pdf" "out.pdf" --out…
-
**Describe the bug**
When calling ocrmypdf 14.2.0 on the example file, Ghostscript is called with the resolution parameter set to `-r1.209464x1.209464`, which leads to an error `Unrecoverable error…
-
I realize this may be thoroughly outside the intended scope of this project, but it would be wonderful if it would process not just PDF files, but a variety of image files (tiff and jpg come to mind).…
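
A minimal sketch of what this request could look like from the caller's side, assuming the tool in question is the `ocrmypdf` command line and that it accepts a single image as input when given its `--image-dpi` option. The helper name `build_ocr_command`, the set of suffixes, and the default of 300 DPI are hypothetical and only for illustration.

```python
from pathlib import Path

# Suffixes treated as images here are an illustrative assumption.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_ocr_command(src, dst, image_dpi=300):
    """Return an ocrmypdf argument list for a PDF or an image input."""
    cmd = ["ocrmypdf"]
    if Path(src).suffix.lower() in IMAGE_SUFFIXES:
        # Images often carry no resolution metadata, so supply one explicitly.
        cmd += ["--image-dpi", str(image_dpi)]
    cmd += [str(src), str(dst)]
    return cmd


if __name__ == "__main__":
    print(build_ocr_command("scan.tiff", "scan.pdf"))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; building the list separately keeps the image-versus-PDF decision easy to test without running OCR.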
-
Tesseract?
- [ ] Leave the installation of Tesseract to the user where appropriate, but provide a configuration for the integration
- [x] #2294
- [ ] OCR for documents in the case file, triggered by user action
- […
-
I believe Adobe Acrobat Pro has something like this.
I could imagine there being an "OCR" option in the menu that could download/cache tesseract.wasm, run the OCR, and then be able to produce a new P…
-
### What were you trying to do?
OCR a PNG with default arguments. That produces a PDF with a JPEG in it (which is counter-intuitive, see https://github.com/ocrmypdf/OCRmyPDF/issues/1124).
However,…
-
**Describe the bug**
Errors are reported during post-processing of a PDF that contains PNG and Group4 TIFF files. I can't tell which images are causing the problem, but there are two error reports and two png files …
-
It's hard to do this with regexes, so it might be worth using OpenCV (as per [this](https://stackoverflow.com/questions/67516273/remove-header-and-footer-from-pdftotext-module-in-python) post, which fu…
-
Hi
I’ve created the following Docker Compose file:
````
version: '2'
services:
  scan-to-paperless:
    image: sbrunner/scan-to-paperless
    container_name: scan-to-paperless
…