exponential-decay / demystify

Engine for analysis of Siegfried export files and DROID CSV. The tool has three purposes: to break the export into its components and store them in a SQLite database; to create additional columns that augment the output where useful; and to query the SQLite database, outputting results in a readable form useful for analysis by researchers and archivists within digital preservation departments in memory institutions. The tool will find duplicates, unidentified files, blacklisted objects, character encoding issues, and more.
http://www.openplanetsfoundation.org/blogs/2014-06-03-analysis-engine-droid-csv-export
zlib License

Using a siegfried export throws an error #103

Closed andreakb closed 1 year ago

andreakb commented 1 year ago

Hello,

When I tried to run an analysis on a Siegfried export, I got the following error:

andrea@debian:~/demystify$ sudo python3 demystify.py --export /media/sf_Andrea/testsfcollection.sf > analysissftry.htm
2023-08-30 09:08:02 INFO: demystify.py:170:analysis_from_csv(): Generating database from input report...
Traceback (most recent call last):
  File "demystify.py", line 14, in <module>
    main()
  File "demystify.py", line 10, in main
    demystify.main()
  File "/home/andrea/demystify/src/demystify/demystify.py", line 256, in main
    args.export, True, denylist, args.rogues, args.heroes
  File "/home/andrea/demystify/src/demystify/demystify.py", line 171, in analysis_from_csv
    database_path = sqlitefid.identify_and_process_input(format_report)
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/sqlitefid.py", line 52, in identify_and_process_input
    return handleSFYAML(export)
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/sqlitefid.py", line 85, in handleSFYAML
    loader.create_sf_database(sfexport, basedb.getcursor())
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFLoaderClass.py", line 124, in create_sf_database
    sf.read_sf_yaml(sf_export)
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFHandlerClass.py", line 296, in read_sf_yaml
    filedata = self._process_sf(sf, filedata, processed)
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFHandlerClass.py", line 266, in _process_sf
    filesec = self.process_file_section(filedata)
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFHandlerClass.py", line 224, in process_file_section
    raise err
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFHandlerClass.py", line 214, in process_file_section
    id_record.add_field(key, value)  # NOQA
  File "/home/andrea/demystify/src/demystify/sqlitefid/src/sqlitefid/libs/SFHandlerClass.py", line 449, in add_field
    raise IDError("Field '%s' doesn't exist" % field)
src.demystify.sqlitefid.src.sqlitefid.libs.SFHandlerClass.IDError: Field 'class' doesn't exist

Thanks!

kieranjol commented 1 year ago

Hi, I reckon this is similar to this Brunnhilde issue, https://github.com/tw4l/brunnhilde/issues/60, where Siegfried introduced a new field called “class” that tends to break any tools that process the reports.
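To illustrate the failure mode described here, the following is a minimal sketch and not the actual sqlitefid code. The class names, the `KNOWN_FIELDS` set, and the sample match are hypothetical; only the error message is taken from the traceback above. A strict record rejects any field it was not written to expect, while a tolerant one sets unknown fields aside so that a newer Siegfried report still loads.

```python
class IDError(Exception):
    """Raised when a record is handed a field it does not recognise."""


# Hypothetical set of fields the parser was originally written for.
KNOWN_FIELDS = {"ns", "id", "format", "version", "mime", "basis", "warning"}


class StrictRecord:
    """Rejects unknown fields, mirroring the behaviour seen in the traceback."""

    def __init__(self):
        self.fields = {}

    def add_field(self, field, value):
        if field not in KNOWN_FIELDS:
            raise IDError("Field '%s' doesn't exist" % field)
        self.fields[field] = value


class TolerantRecord(StrictRecord):
    """Keeps unknown fields in a separate dict instead of raising."""

    def __init__(self):
        super().__init__()
        self.extras = {}

    def add_field(self, field, value):
        if field in KNOWN_FIELDS:
            self.fields[field] = value
        else:
            # New upstream fields (such as "class") are preserved, not fatal.
            self.extras[field] = value


def load(record_cls, match):
    """Feed every key/value pair of one match into a fresh record."""
    record = record_cls()
    for key, value in match.items():
        record.add_field(key, value)
    return record


# Hypothetical match entry that includes the newer "class" field.
match = {
    "ns": "pronom",
    "id": "fmt/43",
    "format": "JPEG File Interchange Format",
    "class": "Image (Raster)",
}

tolerant = load(TolerantRecord, match)
```

With this sample, `load(StrictRecord, match)` raises `IDError`, while the tolerant variant keeps the value under `extras` so it can be reported on later.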

ross-spencer commented 1 year ago

> Where Siegfried introduced a new field, called “class” and it tends to break any tools that process the reports.

Thanks @andreakb and @kieranjol -- yeah, I think that's it. I'm annoyed with myself, as I'm pretty sure I had a fix for it but then left it on another computer and forgot to merge :grimacing:. Will get this sorted ASAP.

ross-spencer commented 1 year ago

I'm going to close this as fixed. Thanks for logging it, @andreakb.

Updated release here: https://pypi.org/project/demystify-digipres/2.0.0rc4/ and demystify-lite looks like it's working okay again here: https://ross-spencer.github.io/demystify-lite/

Let me know if you find any issues. I'm going to open a new issue for reporting on the new information we're adding with the class field in Siegfried.
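
As a rough idea of what reporting on the class field might involve, here is a small sketch that tallies class values across a parsed report. It is not demystify's implementation: the `count_classes` helper is hypothetical, and the report layout (a "files" list whose entries each hold a "matches" list) is an assumption modelled on Siegfried's JSON output, with made-up sample values.

```python
from collections import Counter


def count_classes(report):
    """Tally the "class" value of every match in an already-parsed report.

    Matches without a class (for example, from older reports) are counted
    under a placeholder label rather than being dropped.
    """
    counts = Counter()
    for file_entry in report.get("files", []):
        for match in file_entry.get("matches", []):
            counts[match.get("class") or "(no class)"] += 1
    return counts


# Hypothetical sample data; the layout is assumed, not taken from this thread.
sample_report = {
    "files": [
        {"filename": "a.jpg", "matches": [{"id": "fmt/43", "class": "Image (Raster)"}]},
        {"filename": "b.png", "matches": [{"id": "fmt/11", "class": "Image (Raster)"}]},
        {"filename": "c.bin", "matches": [{"id": "UNKNOWN"}]},
    ]
}

class_counts = count_classes(sample_report)
```

Counting missing values under an explicit label keeps reports from older and newer Siegfried versions comparable in the same summary.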