Engine for analysis of Siegfried export files and DROID CSV. The tool has three purposes: to break the export into its components and store them in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting results in a readable form useful for analysis by researchers and archivists in digital preservation departments at memory institutions. The tool will find duplicates, unidentified files, blacklisted objects, character encoding issues, and more.
Some queries perform string concatenation too early, for example:
"SELECT 'ns:' || NSDATA.NS_NAME || ' ' || IDDATA.ID, count(IDDATA.ID) as TOTAL\n" (ns + Namespace + Name + id + count)
This leaves very few options for manipulating the result afterwards, e.g. for handling identifications where the version is None. Concatenation shouldn't happen at the query layer when we still want to both use the results as data and format them for presentation.
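One way to defer the concatenation is sketched below: the query returns the namespace, ID, version, and count as separate columns, and a small Python function builds the display string afterwards. This is only an illustration under stated assumptions. The NS_ID join column, the VERSION column, the sample rows, and the format_row helper are hypothetical and are not taken from the tool's actual schema or code; only NSDATA.NS_NAME and IDDATA.ID appear in the query above.

```python
import sqlite3


def format_row(ns_name, fmt_id, version, total):
    """Build the display string in Python (hypothetical helper)."""
    label = f"ns:{ns_name} {fmt_id}"
    # A NULL version arrives as None and can be handled explicitly here.
    if version is not None:
        label += f" (version {version})"
    return f"{label}: {total}"


# In-memory database with an assumed, simplified schema and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE NSDATA (NS_ID INTEGER PRIMARY KEY, NS_NAME TEXT);
    CREATE TABLE IDDATA (NS_ID INTEGER, ID TEXT, VERSION TEXT);
    INSERT INTO NSDATA VALUES (1, 'pronom');
    INSERT INTO IDDATA VALUES
        (1, 'fmt/43', '1.01'),
        (1, 'fmt/43', '1.01'),
        (1, 'x-fmt/111', NULL);
    """
)

# Select raw columns; no string concatenation at the query layer.
rows = conn.execute(
    """
    SELECT NSDATA.NS_NAME, IDDATA.ID, IDDATA.VERSION, count(IDDATA.ID) AS TOTAL
    FROM IDDATA
    JOIN NSDATA ON IDDATA.NS_ID = NSDATA.NS_ID
    GROUP BY NSDATA.NS_NAME, IDDATA.ID, IDDATA.VERSION
    ORDER BY TOTAL DESC, IDDATA.ID
    """
).fetchall()

# The same rows serve as data (tuples) and as presentation (strings).
report = [format_row(*row) for row in rows]
for line in report:
    print(line)
```

Because the rows stay as tuples until the last step, the same result set can be written out as data or rendered for a report without re-running or rewriting the query.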