syedsaqibbukhari / docanalysis

Apache License 2.0
ocrd-tool: in/output groups, steps, fix #40 #41

Closed: kba closed 5 years ago

mjenckel commented 5 years ago

Thank you very much for the quick fix!

n00blet commented 5 years ago

@kba I need a little clarification. Is the input_file_grp the same for all the preprocessing steps, in our case "OCR-D-IMG"? I ask because earlier, the input file group for each step was the output of the previous step. For example, the input file group for deskewing was "OCR-D-IMG-BIN".

kba commented 5 years ago

> For example, the input file group for deskewing was "OCR-D-IMG-BIN"

I had not really considered the order much; this was just a proposal. input_file_grp is only the default; it can (and in practice will) be overridden at runtime. Cf. https://ocr-d.github.io/ocrd_tool#input--output-file-groups

> For example, the input file group for deskewing was "OCR-D-IMG-BIN"

Then it would indeed be best to set input_file_grp for deskewing to OCR-D-IMG-BIN, which would also serve as documentation to users that binarized images are the expected input.
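
To illustrate the point about defaults versus runtime overrides, here is a minimal Python sketch. It is not the OCR-D implementation: `TOOL_DEFAULTS`, the step names `binarize` and `deskew`, the function `resolve_input_grp`, and the group name `OCR-D-IMG-DESKEW` are all hypothetical, chosen only to mirror the `input_file_grp` / `output_file_grp` fields discussed in this thread.

```python
# Hypothetical per-step defaults, loosely modelled on the input_file_grp and
# output_file_grp fields of an ocrd-tool.json entry. The step names and the
# group OCR-D-IMG-DESKEW are assumptions made for this sketch.
TOOL_DEFAULTS = {
    "binarize": {
        "input_file_grp": ["OCR-D-IMG"],
        "output_file_grp": ["OCR-D-IMG-BIN"],
    },
    "deskew": {
        # The default documents that binarized images are the expected input.
        "input_file_grp": ["OCR-D-IMG-BIN"],
        "output_file_grp": ["OCR-D-IMG-DESKEW"],
    },
}


def resolve_input_grp(step, override=None):
    """Return the input file groups for a step.

    A comma-separated runtime override wins; otherwise the declared
    default for that step is used.
    """
    if override:
        return [grp.strip() for grp in override.split(",")]
    return list(TOOL_DEFAULTS[step]["input_file_grp"])


if __name__ == "__main__":
    # No override: the declared default applies.
    print(resolve_input_grp("deskew"))
    # Runtime override: the user points deskewing at the raw images instead.
    print(resolve_input_grp("deskew", "OCR-D-IMG"))
```

With defaults declared this way, the output group of one step matches the input group of the next, so the declarations double as documentation of the intended order while still leaving users free to override them.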