artefactual / archivematica

Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
http://www.archivematica.org
GNU Affero General Public License v3.0

Problem: Verify metadata directory checksums is failing #873

Closed sallain closed 6 years ago

sallain commented 6 years ago

On augustus, I selected a transfer that contains existing checksums (SampleTransfers/DemoTransfer). The transfer fails on the Verify transfer checksums microservice with the following errors:

/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 41: sha256deep: command not found
/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 44: sha256deep: command not found

The same error appears for md5 and sha1 checksums:

/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 41: md5deep: command not found
/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 44: md5deep: command not found
/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 41: sha1deep: command not found
/usr/lib/archivematica/MCPClient/clientScripts/archivematicaCheckMD5NoGUI.sh: line 44: sha1deep: command not found

scollazo commented 6 years ago

The hashdeep package in Ubuntu 16.04 doesn't include the /usr/bin/md5deep symlink. After creating the missing symlinks on augustus, the transfer works.

My question is why we are using md5deep, from the hashdeep package, instead of md5sum, which almost all systems have installed and which is provided by the coreutils package. The output of the commands Archivematica uses is the same (except when getting the program version).

We can fix it in the docs/packages/ansible scripts by creating symlinks with the expected names, or in the clientScript code. @sevein, what do you think?
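
For reference, the symlink option could look roughly like the sketch below. `link_deep_tools` is a hypothetical helper, and the `/usr/bin/hashdeep` location in the usage comment is an assumption about where the package installs the binary; check it on the target host before running anything as root.

```shell
#!/bin/sh
# Hypothetical helper: create md5deep, sha1deep and sha256deep symlinks in
# bin_dir, all pointing at the hashdeep binary. Existing entries are left
# untouched so a packaged symlink is never overwritten.
link_deep_tools() {
    hashdeep_bin="$1"
    bin_dir="$2"
    for name in md5deep sha1deep sha256deep; do
        if [ ! -e "$bin_dir/$name" ] && [ ! -L "$bin_dir/$name" ]; then
            ln -s "$hashdeep_bin" "$bin_dir/$name" || return 1
        fi
    done
}

# Example (as root, path assumed): link_deep_tools /usr/bin/hashdeep /usr/bin
```

Skipping names that already exist keeps the helper safe to re-run, and it does nothing on releases where the package already provides the links.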

sevein commented 6 years ago

I think *deep was used because of the ability to run in recursive mode.

https://github.com/artefactual/archivematica/blob/stable/1.7.x/src/MCPClient/lib/clientScripts/archivematicaCheckMD5NoGUI.sh#L41

scollazo commented 6 years ago

Created the symlinks in the ansible playbooks as a workaround ( https://github.com/artefactual-labs/ansible-archivematica-src/commit/882967dfbd1b7b449ba8c1529320c17f201e10e9 ), but I'm having second thoughts about it: this change also needs to be propagated to the Debian packages and manual installs, and it could cause unexpected behaviour in future Ubuntu releases.

As another approach, I rebuilt the hashdeep package from Ubuntu 17.10, which includes the md5deep links, at https://github.com/artefactual-labs/am-packbuild/pull/105 .

Once this is approved I'll upload the package to the archivematica externals repo, and we can close this bug.

mamedin commented 6 years ago

Another option is to change the md5deep, sha256deep and sha1deep calls in the archivematicaCheckMD5NoGUI.sh file to the corresponding hashdeep form:

hashdeep -c md5
hashdeep -c sha256
hashdeep -c sha1

scollazo commented 6 years ago

Sadly, @mamedin's idea doesn't work. Hashdeep and md5deep were two different programs that have now been merged into one.

Depending on how it's invoked, it produces different output, and I wasn't able to find any hashdeep command-line switch that produces the md5deep style of output:

https://github.com/jessek/hashdeep/blob/master/src/main.cpp#L1262

sallain commented 6 years ago

As this now works on the test servers, I'm going to close the issue - perhaps a different issue should be filed without a 1.7 milestone if there's a plan to change the code.

Thanks!

sevein commented 6 years ago

The issue wasn't reproducible because of that workaround that @scollazo applied on the QA server. I think that we should wait until https://github.com/artefactual-labs/am-packbuild/pull/105 is merged and the package is installed?

kellyannewithane commented 6 years ago

Tested again to double-check with no errors. Closing issue.