OCR-D / ocrd_anybaseocr

DFKI Layout Detection for OCR-D
Apache License 2.0

ocrd-tool.json validation issues #1

Closed kba closed 5 years ago

kba commented 5 years ago
ocrd ocrd-tool ocrd-tool.json validate
<report valid="false">
  <error>[tools.ocrd-anybaseocr-tiseg] 'input_file_grp' is a required property</error>
  <error>[tools.ocrd-anybaseocr-tiseg] 'output_file_grp' is a required property</error>
  <error>[tools.ocrd-anybaseocr-tiseg.categories.0] 'text non-text segment' is not one of ['Image preprocessing', 'Layout analysis', 'Text recognition and optimization', 'Model training', 'Long-term preservation', 'Quality assurance']</error>
  <error>[tools.ocrd-anybaseocr-tiseg.steps.0] 'text/non-text/segment' is not one of ['preprocessing/characterization', 'preprocessing/optimization', 'preprocessing/optimization/cropping', 'preprocessing/optimization/deskewing', 'preprocessing/optimization/despeckling', 'preprocessing/optimization/dewarping', 'preprocessing/optimization/binarization', 'preprocessing/optimization/grayscale_normalization', 'recognition/text-recognition', 'recognition/font-identification', 'recognition/post-correction', 'layout/segmentation', 'layout/segmentation/region', 'layout/segmentation/line', 'layout/segmentation/word', 'layout/segmentation/classification', 'layout/analysis']</error>
  <error>[tools.ocrd-anybaseocr-textline] 'input_file_grp' is a required property</error>
  <error>[tools.ocrd-anybaseocr-textline] 'output_file_grp' is a required property</error>
  <error>[tools.ocrd-anybaseocr-textline.parameters.usegauss] Additional properties are not allowed ('action' was unexpected)</error>
  <error>[tools.ocrd-anybaseocr-textline.parameters.blackseps] Additional properties are not allowed ('action' was unexpected)</error>
  <error>[tools.ocrd-anybaseocr-textline.categories.0] 'text line segment' is not one of ['Image preprocessing', 'Layout analysis', 'Text recognition and optimization', 'Model training', 'Long-term preservation', 'Quality assurance']</error>
  <error>[tools.ocrd-anybaseocr-textline.steps.0] 'text/line/segment' is not one of ['preprocessing/characterization', 'preprocessing/optimization', 'preprocessing/optimization/cropping', 'preprocessing/optimization/deskewing', 'preprocessing/optimization/despeckling', 'preprocessing/optimization/dewarping', 'preprocessing/optimization/binarization', 'preprocessing/optimization/grayscale_normalization', 'recognition/text-recognition', 'recognition/font-identification', 'recognition/post-correction', 'layout/segmentation', 'layout/segmentation/region', 'layout/segmentation/line', 'layout/segmentation/word', 'layout/segmentation/classification', 'layout/analysis']</error>
</report>
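
For illustration, the checks behind this report can be approximated in a few lines of Python. This is only a rough sketch, not the actual `ocrd` validator: the allowed category and step values are copied from the messages above, the `action` check covers only the one unexpected key the report names, and the file group names and the category/step chosen in the "fixed" entry are hypothetical placeholders rather than the values that were eventually committed.

```python
# Allowed values copied verbatim from the validator report above.
ALLOWED_CATEGORIES = [
    "Image preprocessing", "Layout analysis",
    "Text recognition and optimization", "Model training",
    "Long-term preservation", "Quality assurance",
]
ALLOWED_STEPS = [
    "preprocessing/characterization", "preprocessing/optimization",
    "preprocessing/optimization/cropping", "preprocessing/optimization/deskewing",
    "preprocessing/optimization/despeckling", "preprocessing/optimization/dewarping",
    "preprocessing/optimization/binarization",
    "preprocessing/optimization/grayscale_normalization",
    "recognition/text-recognition", "recognition/font-identification",
    "recognition/post-correction", "layout/segmentation",
    "layout/segmentation/region", "layout/segmentation/line",
    "layout/segmentation/word", "layout/segmentation/classification",
    "layout/analysis",
]
REQUIRED_KEYS = ["input_file_grp", "output_file_grp"]


def check_tool(name, tool):
    """Return a list of problems for one entry under "tools" (simplified)."""
    errors = []
    for key in REQUIRED_KEYS:
        if key not in tool:
            errors.append(f"[tools.{name}] '{key}' is a required property")
    for i, category in enumerate(tool.get("categories", [])):
        if category not in ALLOWED_CATEGORIES:
            errors.append(f"[tools.{name}.categories.{i}] '{category}' is not allowed")
    for i, step in enumerate(tool.get("steps", [])):
        if step not in ALLOWED_STEPS:
            errors.append(f"[tools.{name}.steps.{i}] '{step}' is not allowed")
    # Only the key named in the report is flagged; the real schema may reject more.
    for param_name, param in tool.get("parameters", {}).items():
        if "action" in param:
            errors.append(f"[tools.{name}.parameters.{param_name}] 'action' was unexpected")
    return errors


# Entry shaped like the one the report complains about.
broken = {
    "categories": ["text non-text segment"],
    "steps": ["text/non-text/segment"],
}

# One possible repair; group names and the chosen step are assumptions.
fixed = {
    "input_file_grp": ["OCR-D-IMG-BIN"],
    "output_file_grp": ["OCR-D-SEG-TISEG"],
    "categories": ["Layout analysis"],
    "steps": ["layout/segmentation/region"],
}

for message in check_tool("ocrd-anybaseocr-tiseg", broken):
    print(message)
```

Calling `check_tool("ocrd-anybaseocr-tiseg", fixed)` returns an empty list, which mirrors what a clean validation run would look like for that entry.
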
wrznr commented 5 years ago

@kba To which of the two ocrd-tool.json files in the repo are you referring? Why does the file even exist twice?

n00blet commented 5 years ago

@wrznr We now have only one ocrd-tool.json, in the ocrd_anybaseocr directory, and the commits by @kba have been merged.

kba commented 5 years ago

> Why does the file even exist twice?

The convention in the ocrd_* repos is to keep the JSON file in the Python source directory and symlink it from the repository root. See #7
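
As a minimal sketch of that layout, assuming a POSIX filesystem and the `ocrd_anybaseocr` package directory mentioned earlier in the thread, the symlink could be created like this. The helper names `link_tool_json` and `demo` are made up for the example and are not part of any OCR-D tooling; the demo runs in a temporary directory, not a real checkout.

```python
import tempfile
from pathlib import Path


def link_tool_json(repo_root: Path, package: str) -> Path:
    """Symlink <repo_root>/ocrd-tool.json to <package>/ocrd-tool.json."""
    link = repo_root / "ocrd-tool.json"
    # Replace a stale copy or an old link so only one real file remains.
    if link.is_symlink() or link.exists():
        link.unlink()
    # A relative target keeps the link valid if the checkout is moved.
    link.symlink_to(Path(package) / "ocrd-tool.json")
    return link


def demo():
    """Build a throwaway repository layout and link the JSON file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        package_dir = root / "ocrd_anybaseocr"
        package_dir.mkdir()
        (package_dir / "ocrd-tool.json").write_text('{"tools": {}}')
        link = link_tool_json(root, "ocrd_anybaseocr")
        # Reading through the link yields the content of the package file.
        return link.is_symlink(), link.read_text()


if __name__ == "__main__":
    print(demo())
```

With this arrangement there is a single real file to edit, and anything that looks for ocrd-tool.json at the repository root still finds it.
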