-
PaddleOCR seems to be a very nice way to OCR documents.
There is a project called OCRmyPDF (https://github.com/ocrmypdf/OCRmyPDF) which has a plugin system where hOCR-compliant OCR engines can be integra…
-
### What happened?
The fmriprep run on a single subject completed with an error specific to the FSL msm command.
I realized that the configuration parameters used in site-packages/smriprep/data/msm/…
-
### Describe the bug
I tried to OCR a PDF file. However, the OCR couldn't be completed.
### Steps to reproduce
```plain text
1. Run ocrmypdf -v1 --max-image-mpixels 1000 --output-type pdf test.pdf …
-
Hello,
I get an error when trying to use `Cleanup Scans / OCR` with the German language.
I got my deu.traineddata from here [https://github.com/tesseract-ocr/tessdata_fast/blob/main/deu.traineddata](htt…
-
I'm making this issue to try and formalise what should and shouldn't be in a CProject. Since the main interface between parts of the software is the filesystem tree of a CProject, there needs to be a s…
-
### Description of the bug
In more recent versions of PyMuPDF, the contents stream can contain (invalid for PDF?) floating point numbers in scientific notation.
For example, these are generated …
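
To make the problem easier to spot, here is a minimal sketch of how such numbers could be detected in a raw content stream. It assumes the stream has already been extracted and decompressed into `bytes`; the function name `find_scientific_numbers` and the sample stream are hypothetical and not part of PyMuPDF. The PDF specification does not allow exponential format for real numbers, so any match is suspect. The pattern is a heuristic and may also match text inside strings or hex strings.

```python
import re

# PDF real numbers may carry a sign, digits and a decimal point, but no exponent.
# This pattern matches tokens such as 1.5e-05 or 3E+2 that are not part of a
# longer name or number (heuristic: it does not parse PDF strings or comments).
SCI_NOTATION = re.compile(
    rb"(?<![A-Za-z0-9_.#/])[+-]?(?:\d+\.?\d*|\.\d+)[eE][+-]?\d+(?![A-Za-z0-9_.])"
)


def find_scientific_numbers(stream: bytes) -> list[bytes]:
    """Return every token in the stream that looks like scientific notation."""
    return SCI_NOTATION.findall(stream)


# Hypothetical content stream, for illustration only.
sample = b"q 1 0 0 1 1.5e-05 72 cm 0.5 w 3E+2 0 m Q"
print(find_scientific_numbers(sample))  # [b'1.5e-05', b'3E+2']
```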
-
**Overview of Use Case**
I want to make testing a PR for Drupal modules as easy as possible. When we used the (now deprecated) `islandora/` namespace, it was easy to require a different branch bec…
-
**Is your feature request related to a problem? Please describe.**
My use case is "scanning" documents with a smartphone camera, then archiving those "scans" as low-quality monochrome images. But OCR…
-
Here, I'm going to raise some issues related to Tesseract's Hebrew support.
Dear participants who are interested in Arabic support, I suggest raising Arabic issues in a separate issue, even if th…
-
Python 2 will reach EOL at the end of 2019. Distributions will stop shipping it. https://pythonclock.org/