OCR-D / spec

Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)
https://ocr-d.de/en/spec/

GT-level specific metrics #239

Open kba opened 1 year ago

kba commented 1 year ago

_Originally posted by @bertsky in https://github.com/OCR-D/spec/pull/225#discussion_r1086173671_

Speaking of: IMHO it would be quite relevant to offer a CER metric under level-2 (or even level-1) equivalency. Not exclusively (because this is not standard), but as a complementary variant.

This could be done either by normalising both sides to OCR-D GT level 2 (or 1), or by passing equivalence classes (zero edit cost rules) to the distance metric.

For example, a naive metric gives an umlaut error (such as u instead of ü) or a punctuation error (such as one quotation mark variant instead of another) the same cost as any other error. But such errors might not be as relevant as others.
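To make the "zero edit cost rules" idea concrete, here is a minimal sketch of a CER with equivalence classes. The function names (`build_class_map`, `levenshtein`, `cer`) and the classes in `EXAMPLE_CLASSES` are made up for illustration; they are not the actual OCR-D GT level 1/2 rules and not an existing OCR-D API. With single-character classes, this should give the same result as normalising both sides to a canonical character first.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical equivalence classes, for illustration only.
# These are NOT the actual OCR-D GT level 1/2 normalisation rules.
EXAMPLE_CLASSES = [
    {"u", "\u00fc"},            # u and u-umlaut
    {'"', "\u201e", "\u201c"},  # straight and typographic double quotes
]


def build_class_map(classes: Iterable[Set[str]]) -> Dict[str, str]:
    """Map every character to one canonical representative of its class."""
    mapping: Dict[str, str] = {}
    for cls in classes:
        representative = min(cls)
        for ch in cls:
            mapping[ch] = representative
    return mapping


def levenshtein(gt: str, ocr: str,
                class_map: Optional[Dict[str, str]] = None) -> int:
    """Edit distance where substitutions within a class cost zero."""
    class_map = class_map or {}
    prev = list(range(len(ocr) + 1))
    for i, g in enumerate(gt, start=1):
        curr = [i]
        for j, o in enumerate(ocr, start=1):
            same = class_map.get(g, g) == class_map.get(o, o)
            curr.append(min(prev[j] + 1,                        # deletion
                            curr[j - 1] + 1,                    # insertion
                            prev[j - 1] + (0 if same else 1)))  # substitution
        prev = curr
    return prev[-1]


def cer(gt: str, ocr: str,
        class_map: Optional[Dict[str, str]] = None) -> float:
    """Character error rate relative to the length of the ground truth."""
    if not gt:
        raise ValueError("ground truth must not be empty")
    return levenshtein(gt, ocr, class_map) / len(gt)


CLASS_MAP = build_class_map(EXAMPLE_CLASSES)

gt = "\u201eT\u00fcr\u201c"  # ground truth with typographic quotes and umlaut
ocr = '"Tor"'                # OCR output: straight quotes, o for u-umlaut

print(cer(gt, ocr))             # strict CER: 0.6
print(cer(gt, ocr, CLASS_MAP))  # relaxed CER: 0.2
```

In this example the strict CER counts the two quote substitutions as full errors, while the relaxed variant only counts the o-for-ü confusion. Reporting both side by side would keep the standard metric and add the level-specific one as a complement.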