Closed bertsky closed 4 years ago
Sure it is. I will think about how to implement it, because I don't want to lose the option of using processed images, e.g. a binarized version, instead of the original images.
In the OCR-D functional model, all PAGE annotations always refer to the original image. Derived images are under AlternativeImage only.
You could look at /PcGts/Page/AlternativeImage/@filename for binarized/dewarped/deskewed etc. images. But you have to make sure to re-calculate all coordinates then: any segment's @points always refer to the original image under /Page/@imageFilename in PAGE, but AlternativeImage can be cropped (consistent with Border), deskewed (consistent with @orientation) or even dewarped (without information).
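To illustrate the simplest of these cases, here is a minimal sketch of the coordinate shift for a cropped AlternativeImage. It assumes the derived image was cropped to the axis-aligned bounding box of the Border polygon and was not rotated or dewarped; the helper names parse_points and to_cropped_coords are hypothetical and not part of any OCR-D API. A deskewed image would additionally need a rotation by @orientation, which is not covered here.

```python
def parse_points(points):
    """Parse a PAGE @points string such as "10,20 30,40" into (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def to_cropped_coords(points, border_points):
    """Shift segment coordinates from the original image into a cropped image.

    Assumption: the crop is the bounding box of the Border polygon,
    so the shift is just the top-left corner of that bounding box.
    """
    border = parse_points(border_points)
    x0 = min(x for x, _ in border)
    y0 = min(y for _, y in border)
    return [(x - x0, y - y0) for x, y in parse_points(points)]
```

Going the other way (from the cropped image back to the original) would add the same offset instead of subtracting it.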
So maybe you can at least make the second input file group for images optional (and default to @imageFilename), also avoiding the above strange error message when it is missing?
The error message is already fixed. I will implement the optional image parameter soon.
Thanks @JKamlah for making this great tool!
Would it be much effort to remove the requirement to have an explicit second input file group for the image? This should just be dereferenced from the /Page/@imageFilename in the PAGE file (relative to the METS file path). Also, line 35: in_grps[1]: unbound variable is not a good error message IMO.
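As a rough sketch of the dereferencing suggested here, the following reads /Page/@imageFilename from a PAGE document and resolves it against the METS directory, unless an image path is given explicitly. The function name resolve_page_image and its parameters are hypothetical and this is not the tool's actual code. It relies on the `{*}` namespace wildcard, which the standard library's ElementTree supports from Python 3.8 on, so that any PAGE schema version is matched.

```python
import os
import xml.etree.ElementTree as ET


def resolve_page_image(page_xml, mets_dir, explicit_image=None):
    """Return the image path for a PAGE document.

    An explicitly supplied image (e.g. from a second input file group)
    wins; otherwise fall back to /Page/@imageFilename, interpreted
    relative to the directory containing the METS file.
    """
    if explicit_image:
        return explicit_image
    root = ET.fromstring(page_xml)
    page = root.find("{*}Page")  # match Page in any PAGE namespace version
    if page is None:
        raise ValueError("PAGE document has no Page element")
    filename = page.get("imageFilename")
    if not filename:
        raise ValueError("Page element has no imageFilename attribute")
    return os.path.normpath(os.path.join(mets_dir, filename))
```

Raising a descriptive error when the attribute is missing also avoids surfacing a low-level message such as an unbound variable to the user.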