-
Hello,
## Actual Behavior
I'm facing an NPE when I try to export Transcriptions from local Files/Images as ALTO + PAGE with Transkribus 1.9.1 Client GUI and there's no data in the chosen export-D…
-
When validating the invoice
-
```
$ ocr-transform hocr alto2.1 in.html out.xml
Error
SXXP0005: The source document is in namespace http://www.w3.org/1999/xhtml, but all the
template rules match elements in no namespace (Us…
-
Hi! rusty-tesseract is amazing work! It works pretty well on both my Linux and macOS machines!
I have used it on my personal project https://github.com/strrl/dejavu, and I found that I require mor…
-
## Link to OCR from a Manifest
## Use case
I have a file containing OCR text for an item, and would like to make it available via the IIIF manifest.
## Background
We'd like to create a recip…
-
Would it be possible to draw on the ALTO datastream for the text transcription of a page? Right now I think it is pulled from TEXT_OCR, but we are currently considering getting rid of that datastream, since the same content is also found in …
-
**Is your feature request related to a problem? Please describe.**
The first set of digitization content will be the Folk Fest Programs. This set was chosen because it is small (there are 36) and wel…
-
For reproducibility, it would be nice to have a checksum and/or file size of the models used recorded in the XML.
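To make the request concrete, here is a minimal sketch of how a checksum and file size could be computed for a model file and attached to an XML element. The element name `model`, its attribute names, and the choice of SHA-256 are assumptions for illustration only; they are not taken from any existing schema or tool.

```python
import hashlib
import os
import tempfile
import xml.etree.ElementTree as ET


def model_fingerprint(path, chunk_size=1 << 20):
    """Return (SHA-256 hex digest, size in bytes) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large model files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), os.path.getsize(path)


def model_element(path):
    """Build a hypothetical <model> element; all names here are illustrative."""
    checksum, size = model_fingerprint(path)
    return ET.Element("model", {
        "name": os.path.basename(path),
        "checksumType": "SHA-256",
        "checksum": checksum,
        "size": str(size),
    })


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a real model.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(b"abc")
    try:
        print(ET.tostring(model_element(tmp.name), encoding="unicode"))
    finally:
        os.remove(tmp.name)
```

Recording both values lets a reader cheaply rule out a mismatch by size before comparing the full digest.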
-
In PAGE-XML there's `@language` / `@primaryLanguage` of type `pc:LanguageSimpleType` to identify the natural language of segments. Its documentation refers to `ISO 639.x 2016-07-14`, which I cannot ma…
-
For the example `471433v1` (from bioRxiv 10k training dataset), there is an image that is extracted with a blank overlay, with the same coordinates as the true image figure.
In a PDF viewer the doc…