-
[IIIF](http://iiif.io/) defines a [Presentation API](http://iiif.io/api/presentation/2.1/) that allows OCR results in ALTO, where available, to be represented as annotations, linked by a [ma…
cneud updated
6 years ago
-
As noted in the epic and in planning discussions, we will check each work's FileSets to see if they contain an ALTO XML file (which has the file use of "Extracted"). That page-level XML file will contain t…
-
I might consider doing this with LECTAUREP, but I wonder what would be the best approach and how this would impact documenting the volumes and the dataset.
For example, I could do 2 different folde…
-
The exported ALTO and PAGE files are not valid XML. Validators complain, and the PRIMA PageViewer refuses to load such files. Tested example from the GT data set of ÖNB:
$ ocr-validate alto-2-0…
-
I need it for a downstream XSLT pipeline:
https://gitlab.coko.foundation/XSweet/XSweet/-/tree/pdf2html/applications/pdf2html
Sukii updated
2 years ago
-
:\Pan_Cleaner>pan_analyzer --xml 4412.xml
palo_alto_firewall_analyzer - INFO - Running validators
palo_alto_firewall_analyzer.validators.bad_hostnames - INFO - **************************************…
-
Previously, we used pdfalto to generate ALTO XML from the PDF and then https://github.com/filak/hOCR-to-ALTO to convert the ALTO XML to an hOCR file. With the newest release of pdfalto this does…
ghost updated
3 years ago
-
We've run into a handful of ALTO files in which the text overlay font is too large, resulting in overlapping text regions and unreadable text.
Here's an example:
* Configure OSD with
`tileSour…
-
The idea would be to use Aspyre to create a conversion system for the ALTO XML (3) files produced by the [pdfalto](https://github.com/kermitt2/pdfalto) script, while incorporating the modifications…
-
cf #95
I am targeting hOCR and trying to do so from ABBYY's latest form of ALTO. The header for the latter is
```
pixel
2019-08-29ABBYYABBYY FineReader Engine12
...
```
But…
jtlz2 updated
4 years ago