OCR-D processors are required to respect the AlternativeImage annotation of the METS/PAGE pair, cf. spec (ff.). That implies,
on the page level: do not just read @imageFilename, but prefer an AlternativeImage/@filename if it exists; regardless, write the result as a new PAGE, merely referencing the resulting image as an additional AlternativeImage in the PageType and as a mets:file (in one of the OCR-D-IMG-* fileGrps) in METS
on the region level: do not just read @imageFilename and cut the respective region from it, but instead:
prefer an AlternativeImage/@filename (for the region) if it exists, or
prefer an AlternativeImage/@filename (for the page) if it exists and cut the region from it, otherwise
use @imageFilename and cut the respective region from it;
regardless, write the result as a new PAGE, merely referencing the resulting image as an additional AlternativeImage in the RegionType and as a mets:file (in one of the OCR-D-IMG-* fileGrps) in METS
But mind that an effort is currently under way to incorporate a nice API for all that into core, because there is a lot more to it (cf. OCR-D/ocrd_tesserocr#33). So I recommend waiting for the next release of ocrd first.
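In the meantime, the lookup order described above can be sketched with the standard library alone. This is only an illustration, not the API planned for core: the helper names (latest_alternative, select_page_image, select_region_image) and the sample file names are hypothetical, the namespace is assumed to be the 2019-07-15 PAGE version, and picking the last AlternativeImage when several exist is an assumption. It also only selects a file name; it does not deal with coordinate changes between the original and a derived page image.

```python
import xml.etree.ElementTree as ET

# Assumed PAGE namespace (2019-07-15 version); adjust to match your files.
NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"}


def latest_alternative(element):
    """Return @filename of the last direct AlternativeImage child, or None.

    Taking the last entry is an assumption made for this sketch.
    """
    alternatives = element.findall("pc:AlternativeImage", NS)
    return alternatives[-1].get("filename") if alternatives else None


def select_page_image(page):
    """Page level: prefer AlternativeImage/@filename over @imageFilename."""
    return latest_alternative(page) or page.get("imageFilename")


def select_region_image(page, region):
    """Region level: return (filename, needs_crop).

    needs_crop is False when the region has its own AlternativeImage,
    and True when the region still has to be cut from a page-level image.
    """
    region_image = latest_alternative(region)
    if region_image:
        return region_image, False
    return select_page_image(page), True


# Hypothetical sample document, for illustration only.
SAMPLE = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="orig.tif">
    <AlternativeImage filename="page_bin.png"/>
    <TextRegion id="r1">
      <AlternativeImage filename="r1_bin.png"/>
      <Coords points="0,0 10,0 10,10 0,10"/>
    </TextRegion>
    <TextRegion id="r2">
      <Coords points="0,20 10,20 10,30 0,30"/>
    </TextRegion>
  </Page>
</PcGts>
"""

root = ET.fromstring(SAMPLE)
page = root.find("pc:Page", NS)
for region in page.findall("pc:TextRegion", NS):
    print(region.get("id"), select_region_image(page, region))
```

Writing the result back (a new PAGE file with an additional AlternativeImage, plus a mets:file in one of the OCR-D-IMG-* fileGrps) is deliberately left out here, since that is the part the upcoming core API is meant to cover.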