usnistgov / oar-pdr

The NIST Open Access to Research (OAR) Public Data Repository (PDR) system software

Preservation bug fix: properly handle rapid updates #96

Closed RayPlante closed 5 years ago

RayPlante commented 5 years ago

Two behaviors of the preservation service set up a race condition in which an update to an AIP can cause data to be lost:

If the update arrives before the previously saved bags have migrated to S3, the preservation service will either fail to pull over a head bag or pull over the wrong one (i.e. not the latest). The data (or updated metadata) in that bag will then not be represented in the update and will effectively be lost. When this has occurred in the past, all of the data was lost. Further, the output version and sequence number would be wrong, and the new output bags would overwrite previously saved ones.

(In practice, the data has been recoverable because versioning is turned on in the S3 bucket.)
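
To make the failure mode concrete, here is a minimal sketch of the race and of one possible guard against it. It is not the code in this PR. The bag-name pattern, the rule that the head bag is the one with the highest sequence number, and the names `find_head_bag`, `head_bag_for_update`, and `PendingMigrationError` are all assumptions made up for illustration.

```python
import re
from typing import Iterable, Optional

# Hypothetical bag-name pattern for this sketch only:
#   <aipid>.<version>.mbag<profile>-<sequence>.zip
BAG_NAME = re.compile(
    r"^(?P<aip>[^.]+)\.(?P<ver>\d+(?:_\d+)*)\.mbag\d+_\d+-(?P<seq>\d+)\.zip$"
)


class PendingMigrationError(RuntimeError):
    """Raised when bags for an AIP are staged but not yet in long-term storage."""


def aip_of(name: str) -> Optional[str]:
    """Return the AIP ID encoded in a bag name, or None if the name does not parse."""
    m = BAG_NAME.match(name)
    return m.group("aip") if m else None


def find_head_bag(aipid: str, bag_names: Iterable[str]) -> Optional[str]:
    """Pick the bag with the highest sequence number for the given AIP.

    Assumption: the head bag is the one with the highest sequence number.
    """
    head, head_seq = None, -1
    for name in bag_names:
        m = BAG_NAME.match(name)
        if m and m.group("aip") == aipid and int(m.group("seq")) > head_seq:
            head, head_seq = name, int(m.group("seq"))
    return head


def head_bag_for_update(
    aipid: str, migrated: Iterable[str], staged: Iterable[str]
) -> Optional[str]:
    """Return the head bag from long-term storage, refusing to proceed
    while any bag for this AIP is still waiting to be migrated."""
    migrated = list(migrated)
    in_storage = set(migrated)
    pending = sorted(
        n for n in staged if n not in in_storage and aip_of(n) == aipid
    )
    if pending:
        raise PendingMigrationError(
            f"{aipid}: bags not yet migrated: {', '.join(pending)}"
        )
    return find_head_bag(aipid, migrated)
```

Without the guard, calling `find_head_bag` on the long-term storage listing alone returns the stale bag whenever a newer one is still staged, which is the "wrong head bag" case described above. Raising an error instead of guessing is one design choice; falling back to the staged copy would be another.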

This PR addresses this bug with three major changes:

RayPlante commented 5 years ago

Self-tested under oar-docker demo:

  1. Replicated the error with the previous release of oar-pdr by removing files from the distributions service's data directory.
  2. With this PR branch deployed, ran under the same conditions and confirmed that the error is avoided.

Will merge for further testing on testdata.