OCR-D / zenhub

Repo for developing zenhub integration
Apache License 2.0

How to handle multiple URLs in METS #11

Open krvoigt opened 2 years ago

krvoigt commented 2 years ago

Current situation

When a METS file is downloaded (i.e. a workspace is cloned) and its files are downloaded, the @xlink:href of each downloaded file's mets:file/mets:FLocat is changed from its HTTP URL to a local file path relative to the workspace. This makes the data easy to process but makes it very difficult to map those local filenames back to URLs for ingestion into production systems.

How it should be

When cloning workspaces and downloading files, the existing mets:FLocat should not be changed but rather a new mets:FLocat with the local filename should be added.
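
A minimal sketch of that behaviour, using only the Python standard library. The helper name add_local_flocat, the sample file entry, the example URL and path, and the choice of LOCTYPE="OTHER" with OTHERLOCTYPE="FILE" to mark the local copy are all assumptions made for illustration; none of them is fixed by this issue.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

FLOCAT = f"{{{METS_NS}}}FLocat"
HREF = f"{{{XLINK_NS}}}href"


def add_local_flocat(file_el, local_path):
    """Add a mets:FLocat for the local copy; the existing URL FLocat is left untouched.

    Hypothetical helper: marking the local copy with LOCTYPE="OTHER" and
    OTHERLOCTYPE="FILE" is an assumption, not something this issue prescribes.
    """
    # Do not add the same local path twice.
    for flocat in file_el.findall(FLOCAT):
        if flocat.get(HREF) == local_path:
            return flocat
    local = ET.SubElement(file_el, FLOCAT)
    local.set("LOCTYPE", "OTHER")
    local.set("OTHERLOCTYPE", "FILE")
    local.set(HREF, local_path)
    return local


# Made-up sample entry with a single URL location.
SAMPLE = """<mets:file xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink" ID="FILE_0001" MIMETYPE="image/tiff">
  <mets:FLocat LOCTYPE="URL" xlink:href="https://example.org/images/0001.tif"/>
</mets:file>"""

file_el = ET.fromstring(SAMPLE)
add_local_flocat(file_el, "OCR-D-IMG/FILE_0001.tif")
hrefs = [f.get(HREF) for f in file_el.findall(FLOCAT)]
print(hrefs)
```

With both locations kept, the original URL stays available for mapping results back to the production system, while processors can read the local path.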

We also need a processor that removes the mets:FLocat for the local filename, because the ZVDD METS profile does not allow multiple mets:FLocat elements.
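
The cleanup step could look roughly like the sketch below. It is not an existing OCR-D processor: the function name strip_local_flocats and the sample document are hypothetical, and it assumes local copies are recognisable by LOCTYPE="OTHER" and OTHERLOCTYPE="FILE", which this issue does not specify.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)

FILE = f"{{{METS_NS}}}file"
FLOCAT = f"{{{METS_NS}}}FLocat"
HREF = f"{{{XLINK_NS}}}href"


def strip_local_flocats(root):
    """Remove every mets:FLocat marked as a local copy and return how many were removed.

    Assumption: local copies carry LOCTYPE="OTHER" and OTHERLOCTYPE="FILE".
    """
    removed = 0
    # Materialise the lists first so the tree is not modified while iterating over it.
    for file_el in list(root.iter(FILE)):
        for flocat in list(file_el.findall(FLOCAT)):
            if flocat.get("LOCTYPE") == "OTHER" and flocat.get("OTHERLOCTYPE") == "FILE":
                file_el.remove(flocat)
                removed += 1
    return removed


# Made-up sample: one file with both a URL location and a local location.
SAMPLE = """<mets:fileGrp xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink" USE="OCR-D-IMG">
  <mets:file ID="FILE_0001" MIMETYPE="image/tiff">
    <mets:FLocat LOCTYPE="URL" xlink:href="https://example.org/images/0001.tif"/>
    <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="FILE" xlink:href="OCR-D-IMG/FILE_0001.tif"/>
  </mets:file>
</mets:fileGrp>"""

root = ET.fromstring(SAMPLE)
removed = strip_local_flocats(root)
remaining = [f.get(HREF) for f in root.iter(FLOCAT)]
print(removed, remaining)
```

Run before ingestion, this would leave a single URL location on each file entry.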

Steps