Open LynnMcRae opened 6 years ago
Fixity is called by three different processes, under three different conditions.
When we extend this code to include Ingest/Deposit work, Fixity may be called by that process to verify that the incoming content is valid.
Audit service consumer: checks OIS for Moabs, and runs fixity checks on those Moabs
"runs fixity checks"
When an "archive object"** is bagged/zipped, it is checksummed; the generated checksums need to be put in the TCR.
(A later fixity check against archived objects will regenerate the checksum and compare it to the TCR.)
**"archived object" will be "containerized" (e.g. tarred, gzipped, compacted) and there will be a checksum in the TCR for the actual "archived object" as well as for the goodies inside it.
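A rough sketch of the containerize-then-record step, in Python only for illustration. The TCR is stood in for by a plain dict, and `archive_and_record`, `archive_matches_tcr`, and the `example-id` identifier are hypothetical names, not existing code. It assumes SHA-256 and a gzipped tar; the real container format and algorithm are still open.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_and_record(object_id, source_dir, archive_path, tcr):
    """Tar+gzip source_dir and record checksums in tcr (a dict standing in for the TCR).

    Records one checksum for the container itself and one per file inside it.
    """
    source_dir = Path(source_dir)
    contents = {
        p.relative_to(source_dir).as_posix(): sha256_of(p)
        for p in sorted(source_dir.rglob("*"))
        if p.is_file()
    }
    with tarfile.open(str(archive_path), "w:gz") as tar:
        tar.add(str(source_dir), arcname=object_id)
    tcr[object_id] = {"archive": sha256_of(archive_path), "contents": contents}
    return tcr[object_id]


def archive_matches_tcr(object_id, archive_path, tcr):
    """Later fixity check: regenerate the container checksum and compare to the TCR."""
    return sha256_of(archive_path) == tcr[object_id]["archive"]


# Usage with throwaway files; the archive lives outside the directory being archived.
tcr = {}
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "object"
    source.mkdir()
    (source / "file.txt").write_text("hello")
    archive = Path(tmp) / "object.tar.gz"
    record = archive_and_record("example-id", source, archive, tcr)
    intact = archive_matches_tcr("example-id", archive, tcr)
    with open(archive, "ab") as handle:
        handle.write(b"corruption")  # simulate damage to the stored copy
    damaged = archive_matches_tcr("example-id", archive, tcr)
```

The per-file checksums are taken before containerizing, so they describe the contents independently of how the container is built.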
Thus: for a single id
our fixtures: exploded moab object
Moab is NOT containerized.
Every version directory should have a manifest directory; one of the files in here will have checksums (manifest_inventory.xml)
manifest_inventory.xml - contains all the checksums for all the other files in the manifest directory for that version (5-6 different files).
signatureCatalog.xml - contains the checksums for all the files ...... there is info somewhere about where all the checksums live in the Moab object ...
In order to perform fixity, we need to compute checksums on all files in the Moab object and compare the computed checksums against stored checksums.
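A minimal sketch of that compute-and-compare step, in Python only for illustration. It assumes the stored checksums have already been pulled out into a dict of relative path to hex digest (a hypothetical shape; where they actually come from is the open question in these notes) and that SHA-256 is the algorithm. `file_checksum` and `check_fixity` are made-up names.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Stream a file through hashlib so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(root, stored_checksums, algorithm="sha256"):
    """Compare computed checksums for every file under root with stored ones.

    stored_checksums maps relative POSIX paths to hex digests (assumed shape).
    Returns sorted lists of mismatched, missing, and unexpected paths.
    """
    root = Path(root)
    computed = {
        p.relative_to(root).as_posix(): file_checksum(p, algorithm)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    return {
        "mismatched": sorted(
            k for k in computed
            if k in stored_checksums and computed[k] != stored_checksums[k]
        ),
        "missing": sorted(k for k in stored_checksums if k not in computed),
        "unexpected": sorted(k for k in computed if k not in stored_checksums),
    }


# Usage with a throwaway directory standing in for a version directory.
with tempfile.TemporaryDirectory() as tmp:
    manifests = Path(tmp) / "manifests"
    manifests.mkdir()
    (manifests / "a.xml").write_text("<a/>")
    stored = {"manifests/a.xml": file_checksum(manifests / "a.xml")}
    clean = check_fixity(tmp, stored)
    (manifests / "a.xml").write_text("<a>tampered</a>")
    damaged = check_fixity(tmp, stored)
```

Reporting missing and unexpected files separately from mismatches keeps "file changed" distinct from "file set changed", which may matter for deciding what recovery looks like.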
Work chunks:
Fixity Checking:
Can temporarily punt on "where is the fixity info" by loading fixture data into test db, or into spec files, or ...
Can initially implement fixity as:
Question: should fixity "gem" actually be code in Moab gem?
Question: is there already a gem that does the fixity checks that we need?
We will need to parse XML files from Moab object in order to get out checksum information. Perhaps the moab object traversal and pulling checksums out of its xml files should be part of moab gem.
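As an illustration of the parsing step only, here is a small Python sketch. The XML sample is an assumption made up for this example: the `entry` and `fileSignature` element names, their attributes, and the placeholder digests should all be checked against real Moab files before relying on any of it. `stored_checksums` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a signature catalog; NOT copied from a real Moab object.
SAMPLE_CATALOG = """<signatureCatalog objectId="example-object" versionId="1">
  <entry originalVersion="1" groupId="content" storagePath="page-1.jpg">
    <fileSignature size="12" md5="aaa" sha1="bbb" sha256="ccc"/>
  </entry>
  <entry originalVersion="1" groupId="metadata" storagePath="descMetadata.xml">
    <fileSignature size="34" md5="ddd" sha1="eee" sha256="fff"/>
  </entry>
</signatureCatalog>"""


def stored_checksums(xml_text, algorithm="sha256"):
    """Return {"<groupId>/<storagePath>": digest} for one checksum algorithm."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("entry"):
        signature = entry.find("fileSignature")
        # Skip entries that carry no digest for the requested algorithm.
        if signature is None or algorithm not in signature.attrib:
            continue
        key = "{}/{}".format(entry.get("groupId"), entry.get("storagePath"))
        result[key] = signature.get(algorithm)
    return result


sha256_by_path = stored_checksums(SAMPLE_CATALOG)
md5_by_path = stored_checksums(SAMPLE_CATALOG, algorithm="md5")
```

If this traversal ends up in the moab gem, the equivalent would return the same kind of path-to-digest mapping for the fixity code to consume.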
How much fixity checking do we actually need? How much for Moab object vs. archive object?
Can we trust the internal fixity of a Moab object? That is, if the moab object's individual file fixity is good ... can we just trust the overall checksum of the whole object without having to further go after individual files for verification?
For online moab - initial
manifest_inventory.xml only
For archived moab - initial
How is Moab generating checksums now? Perhaps that code is relevant.
Should it be split out so archive object fixity checks can use it? Is it already split out?
This is the second level of audit, a fixity check against storage copies, comparing their checksums against those in the Trusted Checksum Repository.
Audits will need to report findings
Negative findings will need to trigger a recovery process.
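One possible shape for the report-and-recover flow, sketched in Python for illustration. The TCR and the storage copies are both plain dicts of id to checksum here, and `Finding`, `audit_copies`, and the `on_negative` callback are hypothetical names; how recovery is actually triggered is not decided in these notes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    object_id: str
    status: str  # "ok", "mismatch", or "not_in_tcr"
    expected: Optional[str]
    actual: str


def audit_copies(
    copies: Dict[str, str],
    tcr: Dict[str, str],
    on_negative: Callable[[Finding], None],
) -> List[Finding]:
    """Compare each storage copy's checksum with the TCR and report every finding.

    on_negative is called once per non-"ok" finding; it stands in for whatever
    kicks off the recovery process.
    """
    findings = []
    for object_id, actual in sorted(copies.items()):
        expected = tcr.get(object_id)
        if expected is None:
            status = "not_in_tcr"
        elif expected == actual:
            status = "ok"
        else:
            status = "mismatch"
        finding = Finding(object_id, status, expected, actual)
        findings.append(finding)
        if status != "ok":
            on_negative(finding)
    return findings


# Usage: collect negative findings in a list instead of starting real recovery.
needs_recovery = []
report = audit_copies(
    copies={"obj-a": "111", "obj-b": "999", "obj-c": "333"},
    tcr={"obj-a": "111", "obj-b": "222"},
    on_negative=needs_recovery.append,
)
```

Returning every finding, including the "ok" ones, keeps the report complete while the callback handles only the cases that need recovery.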