OCR-D / zenhub

Repo for developing zenhub integration
Apache License 2.0

Remove on demand download of images in processors #58

Open paulpestov opened 2 years ago

paulpestov commented 2 years ago

Current situation

Currently, the tesserocr-recognize processor, for example, tries to download images from the URLs given in the METS file if they are missing from the filesystem. Although this is a neat feature, it may lead to unexpected behaviour in a chain of executions.

How it could be

Could we set boundaries of responsibility for processors? In the above case, this would mean that processors work only with the images they find locally (or at a defined location). The idea is to make processors more predictable and functional, so that given input parameters guarantee a straightforward execution with a predictable output.
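
To illustrate the proposed boundary, here is a minimal Python sketch of a local-only image lookup. It is not taken from OCR-D core or from any processor: the names `resolve_local_image` and `MissingLocalImageError` are hypothetical, and the assumption is that each image reference from the METS file is either a remote URL, a `file://` URL, or a path relative to the workspace directory. Instead of downloading, the function fails early with a clear error.

```python
from pathlib import Path
from urllib.parse import urlparse


class MissingLocalImageError(FileNotFoundError):
    """Hypothetical error: a referenced image is not available locally."""


def resolve_local_image(location: str, workspace_dir: str) -> Path:
    """Return the local path for an image reference without ever downloading.

    `location` is assumed to be the image reference taken from the METS file.
    """
    parsed = urlparse(location)

    # Remote references are rejected instead of being fetched on demand.
    if parsed.scheme in ("http", "https"):
        raise MissingLocalImageError(
            f"{location} is a remote URL; download it in a separate step "
            "before running the processor"
        )

    # Accept file:// URLs as well as plain paths.
    path = Path(parsed.path if parsed.scheme == "file" else location)

    # Relative paths are interpreted relative to the workspace directory.
    if not path.is_absolute():
        path = Path(workspace_dir) / path

    if not path.is_file():
        raise MissingLocalImageError(f"{path} not found on the local filesystem")
    return path
```

With a check like this, downloading becomes an explicit, separate step in the workflow, and a processor run either finds all of its inputs locally or stops before doing any work.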