Engine for analysis of Siegfried export files and DROID CSV. The tool has three purposes: to break the export into its components and store them in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting results in a readable form useful for analysis by researchers and archivists in digital preservation departments at memory institutions. The tool will find duplicates, unidentified files, blacklisted objects, character encoding issues, and more.
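The store-then-query workflow described above can be sketched roughly as follows. This is a minimal illustration only, not the code used by Demystify or sqlitefid: the table name `filedata`, the simplified column set, the sample data, and the helper names `load_export`, `find_duplicates`, and `find_unidentified` are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical, simplified export. Real DROID CSV files carry more columns
# and the real database schema is not shown here.
SAMPLE_CSV = """FILE_PATH,NAME,SIZE,HASH,PUID
/data/a.txt,a.txt,12,abc123,x-fmt/111
/data/copy/a.txt,a.txt,12,abc123,x-fmt/111
/data/b.bin,b.bin,40,def456,
"""


def load_export(csv_text):
    """Load CSV rows into an in-memory SQLite table (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filedata "
        "(file_path TEXT, name TEXT, size INTEGER, hash TEXT, puid TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    # Named placeholders are filled from each row's dictionary keys.
    conn.executemany(
        "INSERT INTO filedata VALUES (:FILE_PATH, :NAME, :SIZE, :HASH, :PUID)",
        rows,
    )
    return conn


def find_duplicates(conn):
    """Return (hash, count) pairs for checksums seen more than once."""
    cur = conn.execute(
        "SELECT hash, COUNT(*) FROM filedata "
        "GROUP BY hash HAVING COUNT(*) > 1 ORDER BY hash"
    )
    return cur.fetchall()


def find_unidentified(conn):
    """Return paths of files with no format identifier recorded."""
    cur = conn.execute(
        "SELECT file_path FROM filedata WHERE puid = '' ORDER BY file_path"
    )
    return [row[0] for row in cur]
```

Calling `load_export(SAMPLE_CSV)` and passing the connection to the two query helpers reports the repeated checksum and the file without an identifier. Keeping the data in SQLite means each further report is one more query rather than another pass over the export.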
After nearly two years of chipping away, the Python 3 compatibility
release is near. This commit represents the entirety of that work,
for this module anyway. Pathlesstaken and sqlitefid have also both
been improved with added testing and better handling of Unicode.
Demystify's changes seem few, but they required a lot of code
tweaking. Each change can be categorized as one of the following:
Python 3 updates, e.g. library import skew, API changes.
Test harness, e.g. actual testing, and runners.
Linting and code-style changes.
Unicode handling improvements.
More idiomatic Python.
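As a rough illustration of the first and fourth categories, the sketch below shows one common way to handle an import that moved between Python 2 and Python 3, and to normalize byte strings to text. It is an example of the general pattern, not code taken from this commit; the helper name `to_text` is hypothetical.

```python
try:
    from urllib.parse import urlparse  # Python 3 location
except ImportError:
    from urlparse import urlparse  # Python 2 fallback


def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding bytes if necessary.

    Undecodable bytes are replaced rather than raising, so a single bad
    file name does not abort a whole report.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, errors="replace")
    return value
```

Decoding once at the boundary, and working with text everywhere else, is the usual way to avoid mixed bytes and text comparisons after a port.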
Changes have been made across the code base, but there is still
more to do, and there will be more information about that soon.
The biggest changes are still required in the output modules. It is
hoped that the experience of rewriting an output such as HTML will
be that much friendlier to coders, with the analysis script doing
most of the heavy lifting and with fairly extensive testing behind
it now. The accuracy of any output should be there; the only need
now is to improve the output.
With the exception of bug fixes for PY2, it is hoped future changes
will now be on a Python 3 branch.
Connected to #42
Connected to #49