qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0

Ocrd cli #33

Closed kba closed 3 years ago

kba commented 3 years ago

eynollah itself doesn't work with the page.imageFilename, so I don't see a reason to change it.

yes it does: you pass it as image_filename kwarg below, which then gets opened with cv2.imread.

oh right, my bad, will change it to pass the return value of download_file instead.

Since we're encouraging METS-relative paths, there might be mets:files that are URLs, but I can't think of a scenario where pc:Page/@imageFilename would be a URL (except in our test data).

granted, but then why use mets.find_files(url=page.imageFilename) here at all?

Mostly because I want to make sure that the PAGE @imageFilename is in the METS. If it isn't, this will raise an IterationError.
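
For readers following along, here is a minimal sketch of the "make sure the image is in the METS" check in plain Python. It does not use the OCR-D API: `MetsFile`, `find_files` and `resolve_image` are hypothetical stand-ins written for illustration, and the sketch assumes the lookup is a generator consumed with `next()`. Under that assumption, the built-in exception Python raises on an empty result is `StopIteration`, which the sketch turns into a more descriptive error.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class MetsFile:
    """Hypothetical stand-in for a mets:file entry (not the OCR-D class)."""
    url: str
    local_filename: str


def find_files(files: List[MetsFile], url: str) -> Iterator[MetsFile]:
    """Lazily yield every entry whose URL matches (generator-style lookup)."""
    return (f for f in files if f.url == url)


def resolve_image(files: List[MetsFile], image_filename: str) -> str:
    """Return the local path for a PAGE @imageFilename, failing if it is not in the METS."""
    try:
        # next() on an empty generator raises StopIteration
        match = next(find_files(files, image_filename))
    except StopIteration:
        raise LookupError(f"{image_filename!r} is not referenced in the METS") from None
    return match.local_filename


# Illustrative data only; the paths are made up.
mets_files = [MetsFile(url="images/0001.tif", local_filename="/tmp/ws/images/0001.tif")]
resolved = resolve_image(mets_files, "images/0001.tif")
```

The resolved local path is what would then be handed on as the image_filename kwarg, so that whatever opens the image never sees a METS-relative path or URL.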