kba closed 1 year ago
LGTM!
I've merged all your proposals AFAICT.
In the future, please modify only the YAML files; the JSON files are generated from them. I can also generate the JSON in a non-pretty-printed format to reduce confusion.
Yeah, I noticed that too – too late. :zany_face: Will do in the future!
Just to make it as clear as possible: for the purposes of these definitions, is a character a glyph? Something printable and visual, a graphical representation of a character? If so, is any special whitespace code point (space, tab, zero-width space, invisible times, ...) not a character as far as OCR-D QA is concerned?
IMHO this is quite reasonable.
This doesn't apply to word-based metrics. But since structured GT is usually the backbone for evaluation, word boundaries (or the words themselves) are already present in the data if it is available at least at word level.
Since this implies stripping all spaces beforehand for character-based text evaluation, it should be clarified at which level (line with spaces, or finer) both the GT and the corresponding candidate data are available.
If the GT is, for whatever reason, only present at line level, I assume that these spaces are normalized too, or even inserted by some legacy tooling, so there's no reason to keep these code points either.
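To make the question concrete, here is a minimal sketch of what "strip whitespace code points before counting characters" could look like. It is only an illustration under assumptions: the set of Unicode general categories treated as "not a character" (`NON_GLYPH_CATEGORIES`) and the helper names `strip_non_glyphs` and `glyph_count` are hypothetical and not taken from the spec or from any OCR-D tool.

```python
import unicodedata

# Assumption: code points in these Unicode general categories are not counted
# as characters. Zs/Zl/Zp are separators (e.g. space), Cc are control
# characters (e.g. tab), Cf are format characters (e.g. zero-width space
# U+200B, invisible times U+2062). The spec may draw this line differently.
NON_GLYPH_CATEGORIES = {"Zs", "Zl", "Zp", "Cc", "Cf"}


def strip_non_glyphs(text: str) -> str:
    """Return text without code points from NON_GLYPH_CATEGORIES."""
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) not in NON_GLYPH_CATEGORIES
    )


def glyph_count(text: str) -> int:
    """Count the code points that remain after stripping."""
    return len(strip_non_glyphs(text))


if __name__ == "__main__":
    line = "a b\tc\u200bd\u2062e"
    print(strip_non_glyphs(line))
    print(glyph_count(line))
```

Note that `Cf` also covers code points such as the soft hyphen and zero-width joiner, which may matter for some scripts, so the category set would need to be agreed on explicitly.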
Due to ongoing changes in the ocrd_eval schema I think we should omit those changes for this PR to separate the different issues that this PR is trying to solve: defining a first draft of metrics definitions and creating a JSON schema for the Quiver API. Since the API is still at a very early stage I'm not quite sure if it generates that much value for us if we create a spec right now.
I second that. Since the requirements for the UI are not completely clear yet, we should move the JSON schema for the data to be delivered by the back end to a separate branch.
Fine with me.
Schema changes now in https://github.com/OCR-D/spec/pull/236
Merged, and I will release it later. There are still open questions and "postponed" metrics, but it is an excellent first version we can and will build upon.
If I missed an unresolved discussion or some aspect that should be tracked in a dedicated issue, please let me know and/or open an issue.
The only open discussion is the one about BoW metrics. It's still somewhat valuable because it shows which implementations use which definitions. (Or should we add this to our Evaluation Wiki page?)
This pull request offers our first draft for the QA Specs. It consists of two main parts:
- ocrd_eval.md (which is equal to https://pad.gwdg.de/rLDBVhmYQ8CwOd67KcYHwQ#)
- the schema for the file format we want to use e.g. for the benchmarking, cf. https://github.com/OCR-D/spec/pull/236

ocrd_eval.sample.yml