uga-libraries / format-report

Aggregate and analyze CSV files with file format information generated by the UGA Libraries' digital preservation system (ARCHive).
Creative Commons Attribution Share Alike 4.0 International

Need no version to match NARA unspecified version? #42

Open amhanson9 opened 1 year ago

amhanson9 commented 1 year ago

In the accessioning script, the match_nara_risk() function includes a cleanup step: if a file's format identification has no version and one of its NARA matches is for "unspecified version", all other matches for that file are removed. This cleanup relies on having the file path, which is not present in ARCHive format identifications.
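For reference, the cleanup rule described above could be sketched roughly as follows. This is not the actual code from match_nara_risk(); the function name and the dictionary keys ("path", "version", "nara_format") are hypothetical and only illustrate why the file path is needed to group the matches.

```python
from collections import defaultdict


def keep_unspecified_version_matches(rows):
    """Sketch of the cleanup rule (hypothetical names, not the real script).

    rows: list of dicts, one per NARA match, with assumed keys
    "path" (file path), "version" (format version, "" if none),
    and "nara_format" (name of the matched NARA format).
    """
    # Group the matches by file path. This is the step that cannot be
    # reproduced with ARCHive format identifications, which have no path.
    matches_by_file = defaultdict(list)
    for row in rows:
        matches_by_file[row["path"]].append(row)

    cleaned = []
    for matches in matches_by_file.values():
        # The format identification has no version.
        no_version = all(not match["version"] for match in matches)
        # NARA matches that are explicitly for an unspecified version.
        unspecified = [
            match for match in matches
            if "unspecified version" in match["nara_format"].lower()
        ]
        if no_version and unspecified:
            # Keep only the "unspecified version" matches for this file.
            cleaned.extend(unspecified)
        else:
            # Otherwise leave every match in place.
            cleaned.extend(matches)
    return cleaned
```

Without a file path, some other key would be needed to decide which matches belong together, which is the open question in this issue.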

This cleanup may no longer be needed. I think it mostly helped with matches made by file extension, which we also cannot do with ARCHive format identifications. Monitor what manual cleanup is needed before deciding whether a different method for this type of cleanup is required.