-
I have cloned the repository, successfully compiled the pdfalto tool as instructed in the README, and processed a PDF file to get a few output files, including an XML file that appears to be an ALTO XML …
-
Hello,
Could you please add the Sloane Lab HTR Model to the HTR United repository?
Many thanks and best wishes
Marco
Here is our dataset YAML file:
```yaml
schema: https://htr-united.git…
```
-
ALTO (also true for MODS, METS and EAD) uses a LoC version of xlink.xsd (http://www.loc.gov/standards/xlink/xlink.xsd) but with a w3c namespace (https://www.w3.org/1999/xlink.xsd). When validating mi…
-
### Terraform Core Version
v1.1.2
### AWS Provider Version
5.6.2
### Affected Resource(s)
aws_s3
### Expected Behavior
All objects older than one day should get deleted which are …
-
The idea would be to use Aspyre to build a conversion system for the ALTO XML (3) files produced by the [pdfalto](https://github.com/kermitt2/pdfalto) script, incorporating the modifications …
-
Hello !
Here is our dataset YAML file:
```yaml
schema: https://htr-united.github.io/schema/2023-06-27/schema.json
title: Oneirocriticon-Latinum
url: https://github.com/Lithos-Elaia/Oneirocri…
```
-
Hei!
I tried to run something like
```
java -cp ocrevaluation.jar eu.digitisation.Main \
-gt {ground_truth_file} [{encoding}] \
-ocr {ocr_file} [encoding] \
-d {output_directory} [-r…
```
-
Hello HTR-united team!
please consider the following data set description for inclusion in your directory.
Here is our dataset YAML file:
```yaml
schema: https://htr-united.github.io/schem…
```
-
Previously, we used pdfalto to generate an ALTO XML file from the PDF and then https://github.com/filak/hOCR-to-ALTO to convert the ALTO XML to an hOCR file. With the newest release of pdfalto this does…
-
```
$ ocr-transform hocr alto2.1 in.html out.xml
Error
SXXP0005: The source document is in namespace http://www.w3.org/1999/xhtml, but all the
template rules match elements in no namespace (Us…
```