-
Unfortunately, scribeOCR seems to have changed its .hocr processing in the last two weeks. I can no longer import/load my PDF with the .hocr file. Importing the image-containing PDF works. But not with th…
-
### Describe the bug
The generated PDF file has black coloured boxes in place of the images.
### Steps to reproduce
```plain text
1. Run ocrmypdf -v1 --output-type pdf --max-image-mpixels 1000 --te…
```
-
I suggest focusing on 5.x for 2022 at least.
That means we should not break the API (and ABI?). Use C++17, not C++20/C++23.
-
### What were you trying to do?
With tesseract 5.4.0 (released 2 days ago), ocrmypdf crashes with `SubprocessOutputError`. I tried multiple PDFs; I downgraded to tesseract 5.3.4 and everything is f…
-
# What is this?
We got a lot of requests for this, so it's about time. This issue explains the design and how I plan on doing this. I'm happy to get feedback, feature requests, questions, etc.
# H…
-
### Descriptive summary
Search results for works with large (h)OCR/extracted text load unacceptably slowly. This is best seen when searching the OSU General Catalog collection. These works are dense ar…
-
Error log:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ocrmypdf.io.nkjxda86/000007_ocr_hocr.hocr'
at stirling.software.SPDF.utils.ProcessExecutor.runCommandWithOutputHandling(ProcessEx…
-
### Environment
* **Tesseract Version**:
tesseract --version
tesseract 4.0.0-beta.1
leptonica-1.76.0 (Jun 26 2018, 18:21:40) [MSC v.1900 LIB Release x86]
libgif 5.1.4…
-
Hi,
In my thesis, I want to compare MSM_strain, so I want to know how to use it. As I understand it, the usage is as follows:
1. Download the executable file https://github.com/ecr05/MS…
-
### Describe the bug
Rare error on an Adobe InDesign 18.0 file (Macintosh)
### Steps to reproduce
```plain text
$ ocrmypdf -v1 --pdf-renderer hocr --output-type pdf -O2 --jbig2-lossy --skip-t…
```