syedsaqibbukhari / docanalysis

Apache License 2.0
CLI diverges from specs #19

Closed: kba closed this issue 5 years ago

kba commented 5 years ago

For example, ocrd-anybaseocr-binarize --help

usage:
    Image binarization using non-linear processing.

            python ocrd-anyBaseOCR-binarize.py -m (mets input file path) -I (input-file-grp name) -O (output-file-grp name) -w (Working directory)

    This is a compute-intensive binarization method that works on degraded
    and historical book pages.

       [-h] [-p PARAMETER] [-w WORK] [-I INPUT] [-O OUTPUT] [-m METS]
       [-o OUTPUTMETS] [-g GROUP]

optional arguments:
  -h, --help            show this help message and exit
  -p PARAMETER, --parameter PARAMETER
                        Parameter file location
  -w WORK, --work WORK  Working directory location
  -I INPUT, --Input INPUT
                        Input directory
  -O OUTPUT, --Output OUTPUT
                        output directory
  -m METS, --mets METS  METs input file
  -o OUTPUTMETS, --OutputMets OUTPUTMETS
                        METs output file
  -g GROUP, --group GROUP
                        METs image group id

Compare this with the help output of ocrd-tesserocr-segment-line:

Usage: ocrd-tesserocr-segment-line [OPTIONS]

Options:
  -V, --version                   Show version
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Log level
  -J, --dump-json                 Dump tool description as JSON and
  -p, --parameter PATH
  -g, --page-id TEXT              ID(s) of the pages to process
  -O, --output-file-grp TEXT      File group(s) used as output.
  -I, --input-file-grp TEXT       File group(s) used as input.
  -w, --working-dir TEXT          Working Directory
  -m, --mets TEXT                 METS URL to validate
  --help                          Show this message and exit.

The --log-level and --dump-json options are missing, the long parameter names do not match (for example --Input instead of --input-file-grp, --work instead of --working-dir), and an unspecified -o/--OutputMets parameter was added.
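
To make the expected interface concrete, here is a minimal sketch of a parser that exposes the same short and long option names as the ocrd-tesserocr help output above. It uses plain argparse purely for illustration and is not how the OCR-D tooling or this repository builds its CLI. The function name build_parser and the file group names in the usage example are hypothetical.

```python
import argparse

# Log levels copied from the ocrd-tesserocr help output quoted above.
LOG_LEVELS = ["OFF", "ERROR", "WARN", "INFO", "DEBUG", "TRACE"]


def build_parser(prog="ocrd-anybaseocr-binarize"):
    """Build a parser whose option names mirror the help output above.

    Illustration only: build_parser is a hypothetical helper, not part
    of OCR-D or of this repository.
    """
    parser = argparse.ArgumentParser(
        prog=prog,
        description="Image binarization using non-linear processing.",
    )
    parser.add_argument("-V", "--version", action="store_true",
                        help="Show version")
    parser.add_argument("-l", "--log-level", choices=LOG_LEVELS,
                        help="Log level")
    parser.add_argument("-J", "--dump-json", action="store_true",
                        help="Dump tool description as JSON")
    parser.add_argument("-p", "--parameter", metavar="PATH",
                        help="Parameter file location")
    parser.add_argument("-g", "--page-id",
                        help="ID(s) of the pages to process")
    parser.add_argument("-O", "--output-file-grp",
                        help="File group(s) used as output")
    parser.add_argument("-I", "--input-file-grp",
                        help="File group(s) used as input")
    parser.add_argument("-w", "--working-dir",
                        help="Working directory")
    parser.add_argument("-m", "--mets",
                        help="METS file to process")
    return parser


if __name__ == "__main__":
    # Hypothetical invocation; the file group names are placeholders.
    args = build_parser().parse_args(
        ["-m", "mets.xml", "-I", "OCR-D-IMG", "-O", "OCR-D-BIN", "-l", "DEBUG"]
    )
    print(args.input_file_grp, args.output_file_grp, args.log_level)
```

Note that this sketch has no -o/--OutputMets option and that -g maps to --page-id rather than --group, which matches the differences listed above.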

n00blet commented 5 years ago

Most of the scripts now follow the OCR-D CLI spec, and the older CLI arguments have been removed.