-
Is there any support for hOCR planned?
-
### Current Behavior
I used Tesseract 5.4.1 in WSL/Win10 and Tesseract 5.0.1 in gImageReader/Win10 with different image files (Fraktur newspaper and Latin/LibreOffice document, 2 columns, all images…
-
Same problem as https://github.com/ocropus/hocr-tools/issues/189 in this fork.
-
### Environment
* **Tesseract Version**: **4.0.0-beta.4-37-g115f**
* **Commit Number**: **4.0.0-beta.4-37-g115f**
* **Platform**: **x86_64 GNU/Linux**
### Current Behavior:
I want to repor…
-
Tesseract Version: v5.0.0-alpha.20190623
Platform: Windows 10 64-bit
Current Behavior: For the Thai language (almost) every individual character in hOCR output is a word
Expected Behavior: Words (…
FrkBo updated
2 weeks ago
-
For reproduction steps see the workaround for https://github.com/OurDigitalWorld/hocrmod/issues/1
At the top right is the text 'print', which this script still doesn't find.
```
python hocrmod.py…
rmast updated
3 months ago
-
I am trying to generate a searchable PDF from a JPEG file and an hOCR file with the help of hocr-pdf.
I have both files in the same folder. `hocr-pdf . > out.pdf` generates a PDF but I cannot search…
-
Add download options, if a canvas has associated seeAlso resources, e.g. hOCR:
example manifest with hOCR:
https://api.digitale-sammlungen.de/iiif/presentation/v2/bsb11659582/manifest
```
"see…
-
[hOCR](https://www.wikiwand.com/en/HOCR) is an open standard of data representation for formatted text obtained from optical character recognition (OCR). The definition encodes text, style, layout inf…
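To make the description above concrete, here is a minimal sketch of how text and layout are carried in hOCR markup: each recognized word is a `span` with class `ocrx_word`, and its bounding box sits in the `title` attribute. The `SAMPLE` markup, the `HocrWordParser` class, and the `parse_hocr_words` helper are illustrative names made up for this example, not part of any of the tools mentioned in these issues, and real OCR output will carry more properties than shown here.

```python
import re
from html.parser import HTMLParser

# hOCR stores the bounding box in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")

# Hand-written sample (an assumption for illustration, not real OCR output).
SAMPLE = """<div class='ocr_page' title='bbox 0 0 600 800'>
<span class='ocr_line' title='bbox 10 20 300 50'>
<span class='ocrx_word' title='bbox 10 20 120 50; x_wconf 95'>Hello</span>
<span class='ocrx_word' title='bbox 130 20 300 50; x_wconf 91'>hOCR</span>
</span></div>"""


class HocrWordParser(HTMLParser):
    """Collects (text, bbox) pairs for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._depth = 0      # span nesting depth inside the current word
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth:
            # A span nested inside a word: track it so its end tag
            # does not close the word early.
            self._depth += 1
            return
        attr_map = dict(attrs)
        if "ocrx_word" in (attr_map.get("class") or "").split():
            match = BBOX_RE.search(attr_map.get("title") or "")
            self._bbox = tuple(int(g) for g in match.groups()) if match else None
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.words.append(("".join(self._text).strip(), self._bbox))


def parse_hocr_words(markup):
    parser = HocrWordParser()
    parser.feed(markup)
    parser.close()
    return parser.words


words = parse_hocr_words(SAMPLE)
for text, bbox in words:
    print(text, bbox)
```

A list like `words` is also a quick way to check word segmentation in an hOCR file, for example to see whether an engine emitted one `ocrx_word` per character instead of per word.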
-
```
Thanks for developing OCRFeeder.
Do you have any plans to include "export to hOCR" functionality?
Together with hOCR export and automatically stitching the files back into the
recognized PDF do…