Open bertsky opened 2 years ago
Also, I think it would be useful to add an option for not moving any local files around at all, including ID changes. (In that case, no references need to be updated. And it is much faster.)
Another option would be to offer just making the new group an alias of the old one (as implemented via XSLT 1.0 in workflow-configuration).
@kba should we make that a separate issue? (Use-cases are aliasing input fileGrp to OCR-D-IMG for our common workflows, or aliasing output fileGrp FULLTEXT to ALTO for myCore.)
Ouch, just noticed that mets-alias-filegrp.xsl is fundamentally broken, because XML IDs must be unique within the document and cannot be reused – I would have to rename them in the new fileGrp (and re-reference them in the physical structMap). Since this kind of thing cannot easily be done in XSL (v1.0 anyway), let's please provide that via Python.
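To make the aliasing idea concrete, here is a rough sketch of what a Python version could look like. It uses only the standard library's ElementTree rather than anything from OCR-D core. The function name alias_file_group and the ID renaming scheme (replace the old group name inside the ID, otherwise prepend the new one) are my own assumptions for illustration, not existing API.

```python
import copy
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
NS = {"mets": "http://www.loc.gov/METS/"}


def alias_file_group(root, old_use, new_use):
    """Copy fileGrp `old_use` as `new_use` with fresh file IDs.

    Hypothetical helper: returns a dict mapping old file IDs to new ones.
    File locations (FLocat) are copied unchanged, so no files are moved.
    """
    filesec = root.find("mets:fileSec", NS)
    old_grp = filesec.find(f"mets:fileGrp[@USE='{old_use}']", NS)
    if old_grp is None:
        raise ValueError(f"no fileGrp with USE={old_use!r}")

    existing_ids = {f.get("ID") for f in root.iter(METS + "file")}
    new_grp = copy.deepcopy(old_grp)
    new_grp.set("USE", new_use)

    id_map = {}
    for f in new_grp.iter(METS + "file"):
        old_id = f.get("ID")
        # Assumed naming scheme; any scheme works as long as IDs stay unique.
        if old_use in old_id:
            new_id = old_id.replace(old_use, new_use)
        else:
            new_id = f"{new_use}_{old_id}"
        if new_id in existing_ids:
            raise ValueError(f"ID {new_id!r} already exists")
        existing_ids.add(new_id)
        id_map[old_id] = new_id
        f.set("ID", new_id)
    filesec.append(new_grp)

    # Re-reference: add an fptr for each new file next to the old one.
    for smap in root.findall("mets:structMap[@TYPE='PHYSICAL']", NS):
        for div in list(smap.iter(METS + "div")):
            for fptr in div.findall("mets:fptr", NS):
                new_id = id_map.get(fptr.get("FILEID"))
                if new_id is not None:
                    ET.SubElement(div, METS + "fptr", FILEID=new_id)
    return id_map
```

Calling alias_file_group(root, "FULLTEXT", "ALTO") on a parsed METS root would then leave the original group untouched and add a second group whose files point to the same locations under new IDs.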
The current implementation of Workspace.rename_file_group is smart in that it also goes after the affected image file references within PAGE files: https://github.com/OCR-D/core/blob/71d295ac1fccbeb4164e230bd584e1920b9ab3c8/ocrd/ocrd/workspace.py#L324-L342
It would be even better if ALTO files (i.e. /alto/Description/sourceImageInformation/fileName) were updated in a similar fashion.
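For the ALTO case, something along these lines might do. This is only a sketch with the standard library's ElementTree, not how core does it for PAGE. The function name update_alto_source_image is made up here, and the {*} namespace wildcard (used so that any ALTO schema version matches) needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def update_alto_source_image(alto_root, old_name, new_name):
    """Rewrite Description/sourceImageInformation/fileName entries.

    Hypothetical helper: `alto_root` is the parsed <alto> root element.
    Only entries whose text equals `old_name` are changed.
    Returns the number of entries updated.
    """
    changed = 0
    path = "./{*}Description/{*}sourceImageInformation/{*}fileName"
    for el in alto_root.findall(path):
        if el.text and el.text.strip() == old_name:
            el.text = new_name
            changed += 1
    return changed
```

The return count makes it easy for a caller to decide whether the ALTO file needs to be written back to disk at all.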