eprintsug / EPrintsArchivematica

Digital Preservation through EPrints-Archivematica Integration - An EPrints export plugin to Archivematica

amid regex for callback from Archivematica #24

Closed photomedia closed 3 years ago

photomedia commented 3 years ago

We currently use this regex to extract the amid from the Archivematica callback:

$amid =~ s/[^0-9]//g; # digits only

https://github.com/eprintsug/EPrintsArchivematica/blob/master/cgi/archivematica/set_uuid

We need to update that regex slightly.

It works fine under the assumption that the transfer name is the same as the amid. So the plugin generates a folder called "1", for example, and that is the name of the transfer in Archivematica as well. However, if there are subsequent exports of "1" that Archivematica picks up, it might call the transfer "1_1" or "1_2", depending on how many previous versions it has. With the current regex, "1_2" would be reduced to "12", which is the wrong amid. To anticipate that, the regex for the callback should extract the digits before the _ and ignore the version number that follows the _. So for "amid":"1_4", it would extract the amid as "1".
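A minimal sketch of the idea, not necessarily what ends up in set_uuid: the helper name extract_amid is hypothetical, and it assumes the transfer name is either plain digits or digits followed by "_" and a numeric version suffix.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the plugin): return the leading digits of a
# transfer name, ignoring an optional "_N" version suffix.
sub extract_amid {
    my ($name) = @_;
    return undef unless defined $name;
    # Digits at the start, optionally followed by "_" and more digits.
    return $name =~ /^([0-9]+)(?:_[0-9]+)?\z/ ? $1 : undef;
}

# Current behaviour: stripping every non-digit merges the suffix into the ID.
(my $old = "1_4") =~ s/[^0-9]//g;

print "old: $old\n";                       # old: 14
print "new: ", extract_amid("1_4"), "\n";  # new: 1
print "new: ", extract_amid("12"), "\n";   # new: 12
```

Returning undef for anything that does not match lets the caller reject an unexpected name instead of silently looking up the wrong record.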

photomedia commented 3 years ago

I have noticed that AM consistently uses the folder name (which is the same as the AMID) as the AIP name. That results in the AIP (e.g., name "1") being stored multiple times in AM storage under the same name but with different UUIDs. This is actually OK, and what we want. However, sometimes, perhaps when there is already an existing transfer with the same name, it will append "_1" or "_2", etc., to the name. Then the AIP name would end up being versioned, and the AMID regex would fail.

photomedia commented 3 years ago

This is addressed with the following commit: https://github.com/eprintsug/EPrintsArchivematica/commit/df22a09019a44373612efe509f579db2b058ca96