OCR-D / ocrmultieval

Extensible evaluation of (intermediate) results of an OCR workflow

OcrdSegmentEvaluate: ensure binarized image fits page #1

Open bertsky opened 2 years ago

bertsky commented 2 years ago

https://github.com/kba/ocrmultieval/blob/5de79f3021b48f83f9cb798a484fd472d21ed94b/ocrmultieval/backends/OcrdSegmentEvaluate.py#L21-L23

This does not cover the case where the binary image is itself cropped or deskewed, i.e. does not represent the full PAGE. If the image does not fit the page, it will run into an assertion failure. You better watch the @comments for cropped or deskewed. Either you find some image without them (i.e. binarized before cropping), or do not pass the binary image (which will effectively run without only-fg, i.e. on the full segment masks). Also, try to fetch an image without clipped (as these obviously distort the evaluation).
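A minimal sketch of the check described above, assuming the AlternativeImage entries of a page are available as plain (filename, @comments) pairs. The helper names `features_of` and `pick_full_page_binarized` and the file paths are hypothetical and not part of ocrmultieval or OCR-D; the sketch only illustrates filtering on the comma-separated @comments features.

```python
def features_of(comments):
    """Split an @comments string such as 'binarized,cropped' into a set of feature names."""
    return {f.strip() for f in (comments or "").split(",") if f.strip()}


def pick_full_page_binarized(alternatives,
                             required=("binarized",),
                             forbidden=("cropped", "deskewed", "clipped")):
    """Return the first filename that has all required and no forbidden features.

    Returns None if no such image exists; the caller would then not pass a
    binary image at all (i.e. evaluate on the full segment masks).
    """
    for filename, comments in alternatives:
        feats = features_of(comments)
        if feats.issuperset(required) and feats.isdisjoint(forbidden):
            return filename
    return None


# Hypothetical example data: one page with three derived images.
alternatives = [
    ("OCR-D-CROP/page1.png", "binarized,cropped"),
    ("OCR-D-BIN/page1.png", "binarized"),
    ("OCR-D-DESKEW/page1.png", "binarized,cropped,deskewed"),
]
chosen = pick_full_page_binarized(alternatives)
```

If `chosen` is None, the safe fallback in this sketch is to skip the binary image rather than pass one that does not cover the full page.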

In ocrd_segment.evaluate.EvaluateSegmentation (the OCR-D wrapper) we add relative coordinates in this case (i.e. whatever is consistent with the binary image).

kba commented 2 years ago

You better watch the @comments for cropped or deskewed. Either you find some image without them (i.e. binarized before cropping), or do not pass the binary image (which will effectively run without only-fg, i.e. on the full segment masks). Also, try to fetch an image without clipped (as these obviously distort the evaluation).

Yeah, this is a trivial implementation I did for testing the only-fg implementation. Maybe we could duplicate the workspace.image_from_page feature selector/filter logic in get_AllAlternativeImage{,Paths} to accommodate this?

In ocrd_segment.evaluate.EvaluateSegmentation (the OCR-D wrapper) we add relative coordinates in this case (i.e. whatever is consistent with the binary image).

BTW I am not sure whether reimplementing the OCR-D interfaces for ocrd-segment-evaluate or ocrd-dinglehopper is sensible, since they already exist and take care of edge cases and special conditions already.

bertsky commented 2 years ago

Yeah, this is a trivial implementation I did for testing the only-fg implementation. Maybe we could duplicate the workspace.image_from_page feature selector/filter logic in get_AllAlternativeImage{,Paths} to accommodate this?

The point in Workspace.image_from_page is that it can compute images ad-hoc for the annotated coordinates (e.g. crop+deskew a binary image). And sometimes the information isn't even there: for example, if the binarization ran after cropping and deskewing, you simply don't have full-page binary images – you must run in the relative coordinate system.
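To illustrate what "running in the relative coordinate system" means, here is a small sketch that maps absolute page coordinates into the coordinate system of a cropped image with a 3x3 affine matrix. The functions `crop_transform` and `to_relative` and the numbers are hypothetical and this is not the OCR-D implementation; only the translation caused by cropping is shown, and deskewing would add a rotation to the same kind of matrix.

```python
def crop_transform(left, top):
    """Affine matrix mapping page coordinates into a crop whose top-left corner is (left, top)."""
    return [[1.0, 0.0, -float(left)],
            [0.0, 1.0, -float(top)],
            [0.0, 0.0, 1.0]]


def to_relative(polygon, transform):
    """Apply a 3x3 affine transform to a list of (x, y) points."""
    result = []
    for x, y in polygon:
        rx = transform[0][0] * x + transform[0][1] * y + transform[0][2]
        ry = transform[1][0] * x + transform[1][1] * y + transform[1][2]
        result.append((rx, ry))
    return result


# Hypothetical region polygon in absolute page coordinates.
region = [(120, 80), (520, 80), (520, 300), (120, 300)]
# The binary image was cropped at x=100, y=50, so the mask must be shifted accordingly.
relative = to_relative(region, crop_transform(left=100, top=50))
```

With the polygon expressed in the crop's coordinates, the segment mask and the binary image line up pixel by pixel, which is the consistency the evaluation needs.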

In short, I wouldn't know how to do it in another API – but perhaps I cannot tell the wood from the trees here.

BTW I am not sure whether reimplementing the OCR-D interfaces for ocrd-segment-evaluate or ocrd-dinglehopper is sensible, since they already exist and take care of edge cases and special conditions already.

Yes, it's probably easier to create OCR-D workspaces as disposables where we don't have suitable standalone CLIs.

kba commented 2 years ago

In short, I wouldn't know how to do it in another API – but perhaps I cannot tell the wood from the trees here.

No, you're right, I was only thinking about the @comments attribute logic but not the image manipulations that @comments like cropped entail :/