LibraryOfCongress / bagger

The Bagger application packages data files according to the BagIt specification.

False error message reported after saving zipped bag #58

Closed BluesCruiser closed 6 years ago

BluesCruiser commented 6 years ago

Bagger 2.8.1. Windows 7 Enterprise

[Image: bagger]

johnscancella commented 6 years ago

Hi @BluesCruiser! Thanks for submitting this bug. I see that you are saving it to the D drive, which isn't a normal drive in Windows. What kind of storage is this (hard drive, CD/Blu-ray, external hard drive, etc.)?

Also, the zip functionality has known issues (such as not preserving file timestamps, among others) and is not recommended. Instead, you should create a bag and then use another piece of software to zip it.
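A minimal sketch of that workaround, assuming the bag has already been written to disk as a plain directory. It uses only Python's standard `zipfile` module; the function name `zip_bag` is hypothetical and is not part of Bagger or bagit.py. `ZipFile.write()` copies each file's modification time into its archive entry (at the ZIP format's 2-second resolution), and `allowZip64=True` permits archives larger than 4 GB.

```python
import zipfile
from pathlib import Path


def zip_bag(bag_dir, zip_path):
    """Zip an existing bag directory, keeping its top-level folder name.

    Hypothetical helper for illustration. Write zip_path outside bag_dir
    so the archive does not try to include itself.
    """
    bag_dir = Path(bag_dir).resolve()
    zip_path = Path(zip_path)
    # allowZip64 lets the archive and its members exceed the 4 GB ZIP limit.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for path in sorted(bag_dir.rglob("*")):
            if path.is_file():
                # Store entries as "<bag name>/<relative path>".
                # write() records the file's mtime in the archive entry.
                arcname = path.relative_to(bag_dir.parent).as_posix()
                zf.write(path, arcname=arcname)
    return zip_path
```

For example, `zip_bag("D:/bags/mybag", "D:/bags/mybag.zip")` (illustrative paths) would produce an archive whose entries all sit under `mybag/`. Empty directories are skipped in this sketch.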

BluesCruiser commented 6 years ago

Hi @johnscancella - many thanks for the quick response. My D: drive is a partition on a hard disk. I wasn't aware that the zip functionality has problems associated with it so this information is extremely useful as we're using zipped bags to add data into our digital preservation system (Archivematica) and I don't believe that our vendor is aware of these issues. If you could elaborate more on them I would be most grateful (or point me to somewhere I can find more information). Thanks for the workaround, I will share it with my colleagues.

johnscancella commented 6 years ago

The timestamp issue is documented in https://github.com/LibraryOfCongress/bagger/issues/16 and stems from the fact that the original developer wrote a custom zip writer that doesn't account for many use cases (code: https://github.com/LibraryOfCongress/bagit-java/blob/4.12/src/main/java/gov/loc/repository/bagit/writer/impl/ZipWriter.java). Having dealt with this code, I would recommend against creating a zip bag directly. Also, if you are creating a bag of 1GB or more, I would use the bagit.py command line tool instead, as it is faster and better supports the BagIt specification.

BluesCruiser commented 6 years ago

This is great! Thank you so much. In fact I have been creating very large zipped bags (up to and including one of 123GB) so thanks for the tip on bagit.py!

johnscancella commented 6 years ago

You're welcome. Please spread the word: if at all possible, use the command line version instead of Bagger. It will be faster and use fewer system resources (such as memory and CPU). bagit.py is also much better tested and has been updated for the latest specification.