Proof-of-concept for extensible evaluation of (intermediate) results of an OCR workflow
make deps install
All evaluation functionality is provided by backends.
Every backend inherits from EvalBackend and must implement a compare_files
method that accepts the paths and media types of the Ground Truth and the
detection results, performs the actual evaluation and returns an EvalReport.
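The backend contract described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the project's actual code: the `EvalBackend` and `EvalReport` classes here are hypothetical stand-ins, the argument order of `compare_files` is an assumption, and `PlainTextSimilarity` is a toy backend invented for this example.

```python
from abc import ABC, abstractmethod
from difflib import SequenceMatcher
from pathlib import Path


class EvalReport(dict):
    """Hypothetical stand-in: a plain mapping of metric name to value."""


class EvalBackend(ABC):
    """Hypothetical stand-in for the base class every backend inherits from."""

    @abstractmethod
    def compare_files(self, gt_file, gt_mediatype, ocr_file, ocr_mediatype):
        """Evaluate one GT/OCR pair and return an EvalReport (assumed signature)."""


class PlainTextSimilarity(EvalBackend):
    """Toy backend (not part of ocrmultieval): similarity of two text files."""

    supported_mediatypes = ("text/plain",)

    def compare_files(self, gt_file, gt_mediatype, ocr_file, ocr_mediatype):
        # Reject inputs this backend cannot read.
        for mediatype in (gt_mediatype, ocr_mediatype):
            if mediatype not in self.supported_mediatypes:
                raise ValueError(f"unsupported media type: {mediatype}")
        gt_text = Path(gt_file).read_text(encoding="utf-8")
        ocr_text = Path(ocr_file).read_text(encoding="utf-8")
        # difflib's ratio is 1.0 for identical strings, lower otherwise.
        ratio = SequenceMatcher(None, gt_text, ocr_text).ratio()
        return EvalReport(
            similarity=ratio,
            gt_length=len(gt_text),
            ocr_length=len(ocr_text),
        )
```

A real backend would typically wrap an external evaluation tool instead of computing a metric itself; the point here is only the shape of the interface.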
An EvalReport is a map of metrics to their respective values and can be
serialized as JSON or CSV for further processing/analysis.
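To illustrate what serializing such a metric map might look like, here is a small sketch using only the Python standard library. The helper names `report_to_json` and `report_to_csv`, the metric names `cer` and `wer`, and the two-column CSV layout are assumptions made for this example, not the project's actual output format.

```python
import csv
import io
import json


def report_to_json(report):
    """Serialize a metric map as pretty-printed JSON."""
    return json.dumps(report, indent=2, sort_keys=True)


def report_to_csv(report):
    """Serialize a metric map as CSV with one metric per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["metric", "value"])
    for metric, value in sorted(report.items()):
        writer.writerow([metric, value])
    return buffer.getvalue()


# Hypothetical metric names and values, for illustration only.
report = {"cer": 0.05, "wer": 0.12}
print(report_to_json(report))
print(report_to_csv(report))
```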
The glue code for running the backends is in ocrmultieval.runner.py.
The ocrmultieval compare
command line tool allows evaluating individual pages of GT
and detection with any of the available backends.
Usage: ocrmultieval compare [OPTIONS] {dinglehopper|ocrevalUAtion|PrimaTextEval|CorAsvAnnEval|CorAsvAnnCompare|OcrdSegmentEvaluate|IsriOcreval} GT_FILE OCR_FILE
Options:
  --gt-mediatype TEXT
  --ocr-mediatype TEXT
  --format [csv|json|yaml|xml]
  -g, --pageId TEXT             pageId to uniquely identify pages in a work
  --help                        Show this message and exit.
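When calling the tool from a script, the command line can be assembled and checked against the choices shown in the usage text before anything is executed. The following is a sketch only: `build_compare_argv` is a hypothetical helper, and the file paths are placeholders.

```python
import shlex

# Backend and format choices copied from the usage text above.
BACKENDS = (
    "dinglehopper", "ocrevalUAtion", "PrimaTextEval", "CorAsvAnnEval",
    "CorAsvAnnCompare", "OcrdSegmentEvaluate", "IsriOcreval",
)
FORMATS = ("csv", "json", "yaml", "xml")


def build_compare_argv(backend, gt_file, ocr_file, fmt=None, page_id=None):
    """Build an argument list for `ocrmultieval compare` (hypothetical helper)."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    argv = ["ocrmultieval", "compare"]
    if fmt is not None:
        argv += ["--format", fmt]
    if page_id is not None:
        argv += ["--pageId", page_id]
    argv += [backend, gt_file, ocr_file]
    return argv


# Placeholder paths; substitute real GT and OCR files.
argv = build_compare_argv(
    "dinglehopper", "gt/page_0001.xml", "ocr/page_0001.xml", fmt="json"
)
print(shlex.join(argv))
```

With the package installed, the resulting list could be passed to `subprocess.run(argv, check=True)`.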
The ocrd-ocrmultieval command line tool implements the OCR-D processor API
and can be used to process complete workspaces.
Usage: ocrd-ocrmultieval [OPTIONS]
Evaluate
> Eval processor
Options:
  -I, --input-file-grp USE        File group(s) used as input
  -O, --output-file-grp USE       File group(s) used as output
  -g, --page-id ID                Physical page ID(s) to process
  --overwrite                     Remove existing output pages/images
                                  (with --page-id, remove only those)
  -p, --parameter JSON-PATH       Parameters, either verbatim JSON string
                                  or JSON file path
  -P, --param-override KEY VAL    Override a single JSON object key-value pair,
                                  taking precedence over --parameter
  -m, --mets URL-PATH             URL or file path of METS to process
  -w, --working-dir PATH          Working directory of local workspace
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Log level
  -C, --show-resource RESNAME     Dump the content of processor resource RESNAME
  -L, --list-resources            List names of processor resources
  -J, --dump-json                 Dump tool description as JSON and exit
  -h, --help                      This help message
  -V, --version                   Show version
Parameters:
  "backend" [string - "PrimaTextEval"]
    Backend to use
    Possible values: ["PrimaTextEval", "ocrevalUAtion", "dinglehopper",
    "OcrdSegmentEvaluate", "IsriOcreval", "CorAsvAnnCompare"]
  "format" [string - "csv"]
    Output format
    Possible values: ["csv", "json", "yaml", "xml"]
  "config" [object]
    Configuration to override default
Default Wiring:
  ['GT,OCR1'] -> ['GT_VS_OCR1']
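The parameters listed above are passed to the processor as a JSON object via `-p`. As a rough sketch, the snippet below builds such an object, checks it against the documented defaults and possible values, and assembles a command line that follows the default wiring. `build_parameters` is a hypothetical helper written for this example, and no structure is assumed for `config` beyond it being a JSON object.

```python
import json

# Defaults and possible values copied from the parameter listing above.
PARAMETER_SPEC = {
    "backend": {
        "default": "PrimaTextEval",
        "values": ["PrimaTextEval", "ocrevalUAtion", "dinglehopper",
                   "OcrdSegmentEvaluate", "IsriOcreval", "CorAsvAnnCompare"],
    },
    "format": {"default": "csv", "values": ["csv", "json", "yaml", "xml"]},
}


def build_parameters(**overrides):
    """Return a validated parameter dict (hypothetical helper)."""
    params = {name: spec["default"] for name, spec in PARAMETER_SPEC.items()}
    for name, value in overrides.items():
        if name == "config":
            # "config" is documented only as an object; accept any dict.
            if not isinstance(value, dict):
                raise TypeError("config must be a JSON object (dict)")
            params["config"] = value
        elif name not in PARAMETER_SPEC:
            raise KeyError(f"unknown parameter: {name}")
        elif value not in PARAMETER_SPEC[name]["values"]:
            raise ValueError(f"invalid value for {name}: {value}")
        else:
            params[name] = value
    return params


params = build_parameters(backend="dinglehopper", format="json")
# File group names follow the default wiring shown above.
argv = ["ocrd-ocrmultieval", "-I", "GT,OCR1", "-O", "GT_VS_OCR1",
        "-p", json.dumps(params)]
print(argv)
```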