Open bertsky opened 11 months ago
At least for ocrd-tesserocr-recognize, we can also parameterise the model dynamically via XPath queries, using the xpath_model parameter, e.g.
{
"contains('de,deu,ger',@language) and starts-with(@script,'Latf')": "frak2021",
"contains('fr,fre,fra',@language)": "fra",
"@language='hsb' and starts-with(@script,'Latf')": "hsbfraktur",
"@language='hsb'": "hsblatin",
"": "eng"
}
And as a workaround for the missing MODS inheritance, we could simply write a dummy processor that fills the respective PAGE attributes from MODS...
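To make the dummy-processor idea a bit more concrete, here is a minimal sketch in plain Python (standard library only, not an actual OCR-D processor). The function name `fill_page_from_mods`, the PAGE namespace version, and the target attribute names `primaryLanguage` and `primaryScript` are assumptions for illustration. They would have to match whatever the XPath queries end up addressing, and the MODS values are copied verbatim without mapping them to PAGE's allowed values.

```python
import xml.etree.ElementTree as ET

# Namespaces: MODS v3 and one PAGE schema version (the PAGE version is an assumption).
NS = {
    "mods": "http://www.loc.gov/mods/v3",
    "pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15",
}


def fill_page_from_mods(mods_xml, page_xml,
                        lang_attr="primaryLanguage",
                        script_attr="primaryScript"):
    """Copy language/script from MODS onto the PAGE <Page> element.

    Hypothetical helper; the attribute names are assumptions.
    """
    mods = ET.fromstring(mods_xml)
    pcgts = ET.fromstring(page_xml)

    # Take the first language and script term found anywhere in the MODS record.
    lang = mods.findtext(".//mods:language/mods:languageTerm", namespaces=NS)
    script = mods.findtext(".//mods:language/mods:scriptTerm", namespaces=NS)

    page = pcgts.find("pc:Page", NS)
    if page is not None:
        # Only fill gaps: values already annotated in PAGE take precedence.
        if lang and page.get(lang_attr) is None:
            page.set(lang_attr, lang.strip())
        if script and page.get(script_attr) is None:
            page.set(script_attr, script.strip())

    return ET.tostring(pcgts, encoding="unicode")
```

Setting the values only on the top-level Page element keeps the sketch small. Whether lower-level segments would then see them depends on how the consuming processor evaluates its queries, which is not covered here.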
For example, with ocrd-tesserocr-recognize we could do something like:
The question is: do we only apply this when no --workflow is supplied, or should we assume that all workflow files themselves may contain placeholders, e.g. $TESSMODEL, which we must replace on the fly?
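If we went with placeholders, the on-the-fly replacement could be as simple as the following sketch, which uses Python's `string.Template`. The helper name `expand_workflow` and the example workflow line are hypothetical. Only the `$TESSMODEL` placeholder itself comes from the question above.

```python
from string import Template


def expand_workflow(workflow_text, values):
    """Replace $NAME placeholders in a workflow file's text (hypothetical helper).

    safe_substitute leaves placeholders without a supplied value untouched,
    so workflow files without our placeholders pass through as they are.
    """
    return Template(workflow_text).safe_substitute(values)


# Hypothetical workflow line containing the placeholder
line = "ocrd-tesserocr-recognize -I OCR-D-SEG -O OCR-D-OCR -P model $TESSMODEL"
print(expand_workflow(line, {"TESSMODEL": "frak2021"}))
```

One caveat if the workflow files are shell scripts: `Template` treats `$$` as an escape for a literal `$`, so a script that uses `$$` (the shell's process ID) would be altered by this approach.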