sul-dlss / preservation2017

Story repo for preservation core work done summer/fall 2017

Story: Create an audit process for Archived moabs in replication endpoints #19

Open · LynnMcRae opened this issue 7 years ago

LynnMcRae commented 7 years ago

This is the first level of audit: an inventory audit that simply verifies consistency between the PC Catalog and the archive data endpoints, i.e. that all copies that should be there are there.

Audits will need to report findings

And negative findings will need to trigger a recovery process.
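The inventory audit described above can be sketched as a simple set comparison. This is a minimal, hypothetical sketch: the catalog records and endpoint listing are stand-ins for whatever the real PC Catalog and endpoint APIs return, and the function name is illustrative.

```python
def inventory_audit(expected_copies, endpoint_listing):
    """Compare archive copies the catalog says should exist against what
    the endpoint actually holds. Returns (findings, missing): findings is
    the full audit report, and missing is the list of negative findings
    that would trigger the recovery process."""
    found = set(endpoint_listing)
    findings = []
    missing = []
    for copy_name in expected_copies:
        if copy_name in found:
            findings.append((copy_name, "ok"))
        else:
            findings.append((copy_name, "missing"))
            missing.append(copy_name)
    return findings, missing

# Example: one expected copy is present on the endpoint, one is not.
findings, missing = inventory_audit(
    ["bb048rn5648.v0007.zip", "cc123aa0001.v0001.zip"],
    ["bb048rn5648.v0007.zip"],
)
```

The negative findings in `missing` are exactly the items that would be handed off to the recovery process.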

julianmorley commented 7 years ago

The audit process checks the PCC to see if it has a record of the expected archive copy of a moab being found in the expected replication endpoint. For example, if the online moab has a druid of bb048rn5648, we might expect the archive copy of version 7 of that moab to be called bb048rn5648.v0007.zip - so we're looking for the presence of a file with that name in the replication endpoint.
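The naming convention above (druid plus zero-padded version plus `.zip`) is easy to express directly. A tiny sketch, with an illustrative function name that is not from the actual codebase:

```python
def expected_archive_name(druid, version):
    """Build the expected archive-copy filename for a given moab version,
    per the convention described above: <druid>.v<NNNN>.zip."""
    return f"{druid}.v{version:04d}.zip"

print(expected_archive_name("bb048rn5648", 7))  # -> bb048rn5648.v0007.zip
```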

If a record of this expected archive copy is not found in the PCC, but an online copy is found, then the audit process invokes/messages/requests the archive process to take a look. It's possible that an archive needs to be created and sent to the cloud endpoint, or that an archive does exist and the issue is merely that the PCC's record of that copy has expired, in which case it's time for a fresh API call to the cloud to verify that the object still exists.
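The decision branch just described can be sketched as follows. Everything here is hypothetical: the `StubPCC` class and the `endpoint_has`/`request_archive` callables are stand-ins for the real PC Catalog, cloud endpoint API, and archive process, whose actual interfaces are not specified in this issue.

```python
class StubPCC:
    """Stand-in for the PC Catalog: holds archive-copy records and knows
    which druids have online moabs."""
    def __init__(self, records, online_druids):
        self.records = set(records)
        self.online_druids = set(online_druids)

    def has_record(self, name):
        return name in self.records

    def online_copy_exists(self, druid):
        return druid in self.online_druids

    def refresh_record(self, name):
        self.records.add(name)

def audit_archive_copy(druid, version, pcc, endpoint_has, request_archive):
    """Decide what to do about one expected archive copy.
    endpoint_has(name) -> bool is a fresh check against the cloud endpoint;
    request_archive(druid, version) asks the archive process to create and
    send a new archive copy."""
    name = f"{druid}.v{version:04d}.zip"
    if pcc.has_record(name):
        return "ok"                       # PCC already records this copy
    if not pcc.online_copy_exists(druid):
        return "no-online-copy"           # nothing to archive from
    # No PCC record, but an online copy exists: either the archive already
    # exists in the cloud and the PCC record merely expired, or the archive
    # needs to be created and sent.
    if endpoint_has(name):
        pcc.refresh_record(name)          # expired record; refresh it
        return "record-refreshed"
    request_archive(druid, version)
    return "archive-requested"

# Example: no PCC record, online copy present, and the cloud does have the
# zip, so the PCC record just gets refreshed.
pcc = StubPCC(records=[], online_druids=["bb048rn5648"])
result = audit_archive_copy(
    "bb048rn5648", 7, pcc,
    endpoint_has=lambda name: True,
    request_archive=lambda d, v: None,
)
```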

julianmorley commented 7 years ago

Here's a slightly simplified view of the moab object lifecycle, showing the relationship between the online moab object store, the inventory process, the PCC, the audit process, the recovery process, and the archive process: Preservation.Core.Simple.Object.Lifecycle.pdf (updated to have the data flows pointing in the correct direction).

julianmorley commented 7 years ago

And to reiterate Lynn's point up top, this is a first-level audit: "Is it there?", not "Is it correct?".

Actually running fixity checks against replication endpoints (esp. AWS and OSCA deferred access stuff) will require us to run instances in the cloud, so we can recover from tape to cloud storage, run fixity and report back without incurring data egress charges. This is out of scope for this sprint work.