bertsky / workflow-configuration

a makefilization for OCR-D workflows, with configuration examples
Apache License 2.0

Download images first #5

Closed wrznr closed 4 years ago

wrznr commented 4 years ago

When using workflow-configuration, the image files must be physically present in the workspace. That is, you cannot rely on ocrd's ability to download images ad hoc from the corresponding entries in a METS file group. Downloading is best done beforehand via

$ ocrd workspace find -G USE --download

Here, USE stands for the value of the USE attribute of the file group you want to use as input. Maybe this should be added to the documentation?
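To make the step concrete, here is a minimal sketch of how the download could be wrapped in a small script. It is an illustration, not part of workflow-configuration: the FILE_GRP variable, the download_cmd helper and the dry-run fallback are my own assumptions, and OCR-D-IMG is just an example file group name. It also assumes the script is run from inside the workspace directory, next to mets.xml.

```shell
#!/bin/sh
# Sketch: fetch all files of one file group before running a workflow.
# FILE_GRP and download_cmd are hypothetical names used for illustration.
FILE_GRP="${FILE_GRP:-OCR-D-IMG}"

download_cmd() {
    # Print the download command for the file group given as $1
    # (the value of the USE attribute in the METS file).
    printf 'ocrd workspace find -G %s --download\n' "$1"
}

if command -v ocrd >/dev/null 2>&1 && [ -f mets.xml ]; then
    # ocrd is installed and we are inside a workspace: actually download.
    $(download_cmd "$FILE_GRP")
else
    # Otherwise only show what would be run.
    echo "would run: $(download_cmd "$FILE_GRP")"
fi
```

Calling it as FILE_GRP=OCR-D-IMG sh download.sh (the script name is arbitrary) before invoking make should leave the images in place for the workflow.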

bertsky commented 4 years ago

Absolutely! This should be the first step of every workflow that does not start with ocrd-import. I'll make sure this is also made prominent for custom/new configurations.

bertsky commented 4 years ago

On the other hand, there is already gt.mk, which does exactly this for all known GT file groups (including OCR-D-IMG). So you could always do make -f gt.mk before anything else. But that's a different strategy, of course...

bertsky commented 4 years ago

Anyway, fixed via 1dfe678.