LibraryOfCongress / bagit-python

Work with BagIt packages from Python.
http://libraryofcongress.github.io/bagit-python

Added Completeness Check, Addresses #87 #93

Closed: nkrabben closed 7 years ago

nkrabben commented 7 years ago

This PR addresses https://github.com/LibraryOfCongress/bagit-python/issues/87

It splits the logic for completeness checking and hash validation into separate methods, and adds a CLI argument for checking completeness.

The changes also have a larger effect on validation behavior: if a bag is incomplete, hashes will not be calculated. This fits my workflow, where I need to fix bags or request retransfers, but I thought the change should be highlighted.
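To illustrate the behavior described above, here is a minimal sketch of a validation flow that checks completeness first and skips hashing when the bag is incomplete. It is not the code from this PR: `BagIncompleteError`, `check_completeness()`, `check_hashes()`, `validate()`, and the `completeness_only` parameter are hypothetical names, the manifest is assumed to be a plain dict of relative paths to SHA-256 digests, and only the `data/` payload directory is considered.

```python
import hashlib
import os


class BagIncompleteError(Exception):
    """Hypothetical error: files on disk and manifest entries disagree."""


def check_completeness(bag_dir, manifest):
    """Compare manifest paths with payload files on disk; nothing is hashed."""
    on_disk = set()
    for root, _dirs, files in os.walk(os.path.join(bag_dir, "data")):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), bag_dir)
            on_disk.add(rel.replace(os.sep, "/"))
    missing = sorted(set(manifest) - on_disk)
    unlisted = sorted(on_disk - set(manifest))
    if missing or unlisted:
        raise BagIncompleteError(
            "missing=%s unlisted=%s" % (missing, unlisted)
        )


def check_hashes(bag_dir, manifest):
    """Return the manifest paths whose SHA-256 digest does not match."""
    mismatched = []
    for rel, expected in manifest.items():
        # No existence check here: check_completeness() has already run.
        with open(os.path.join(bag_dir, *rel.split("/")), "rb") as handle:
            actual = hashlib.sha256(handle.read()).hexdigest()
        if actual != expected:
            mismatched.append(rel)
    return mismatched


def validate(bag_dir, manifest, completeness_only=False):
    """Completeness first; an incomplete bag raises before any hashing."""
    check_completeness(bag_dir, manifest)
    if completeness_only:
        return []
    return check_hashes(bag_dir, manifest)
```

In this sketch an incomplete bag fails fast with the list of missing and unlisted files, which is the information needed to fix a bag or request a retransfer, and the more expensive hashing pass only runs on bags that are already complete.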

It also introduces some awkwardness in the CLI arguments: --fast does not include completeness checking. In the future it might be better to have the following arguments:

--validate --oxum = oxum check
--validate --completeness = completeness check
--validate --fast = completeness and oxum check
--validate = completeness, oxum, and hash check

But I didn't want to change existing behaviors with this PR.
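The proposed argument scheme can be sketched with `argparse` as follows. This is only an illustration of the mapping from flags to checks, not the current bagit-python CLI: the `--oxum` and `--completeness` flags, the `bagit-sketch` program name, and the `build_parser()` and `selected_checks()` helpers are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser for the proposed (hypothetical) validation flags."""
    parser = argparse.ArgumentParser(prog="bagit-sketch")
    parser.add_argument("--validate", action="store_true")
    parser.add_argument("--oxum", action="store_true")
    parser.add_argument("--completeness", action="store_true")
    parser.add_argument("--fast", action="store_true")
    return parser


def selected_checks(argv):
    """Map a list of command-line arguments to the checks that would run."""
    args = build_parser().parse_args(argv)
    if not args.validate:
        return []
    if args.fast:
        # Proposed meaning of --fast: everything except hashing.
        return ["completeness", "oxum"]
    checks = []
    if args.completeness:
        checks.append("completeness")
    if args.oxum:
        checks.append("oxum")
    # Plain --validate with no narrowing flag runs the full set.
    return checks or ["completeness", "oxum", "hash"]


if __name__ == "__main__":
    print(selected_checks(["--validate", "--fast"]))
```

Under this mapping, `--validate` alone stays the most thorough option and each extra flag narrows the work, so `--fast` would become a strict subset of full validation rather than a separate code path.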

coveralls commented 7 years ago

Coverage decreased (-0.1%) to 84.691% when pulling a0a6785e41945b8d63b3337a6ffd09a9eb6db07a on nkrabben:master into af21e2aec868b7bd9a8d48779ead6f2d95f5835b on LibraryOfCongress:master.

coveralls commented 7 years ago

Coverage decreased (-0.1%) to 84.691% when pulling ef95acf397a0499f98a21e8057d1d210d2854c95 on nkrabben:master into af21e2aec868b7bd9a8d48779ead6f2d95f5835b on LibraryOfCongress:master.

nkrabben commented 7 years ago

The coverage decrease comes from the "file does not exist" exception no longer being triggered during hashing, since validation now stops as soon as the bag is found to be incomplete.

coveralls commented 7 years ago

Coverage decreased (-0.07%) to 84.734% when pulling a07c94e65103e239d0086816f1cd5e1ae344c151 on nkrabben:master into af21e2aec868b7bd9a8d48779ead6f2d95f5835b on LibraryOfCongress:master.

coveralls commented 7 years ago

Coverage increased (+0.009%) to 84.81% when pulling 9809539552c57814128671d55b48a2236a48b105 on nkrabben:master into af21e2aec868b7bd9a8d48779ead6f2d95f5835b on LibraryOfCongress:master.

nkrabben commented 7 years ago

I removed the file existence check in _calculate_file_hashes(), since an incomplete bag now raises an error before _validate_entries() is called.