BAM-PFA / pymm

A set of scripts for audiovisual digital preservation tasks
BSD 2-Clause "Simplified" License

test viability of hashdeep_audit() on LTO #5

Closed mcampos-quinn closed 6 years ago

mcampos-quinn commented 6 years ago

Need to make sure that makeMetadata.hashdeep_audit() is able to use os.chdir (and whatever else it relies on) to verify a package once it's been written to LTO.
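For anyone following along, here is a rough sketch of the kind of check being described: change into the package directory on the mounted tape, run hashdeep in audit mode against a known manifest, then change back. This is not the actual body of hashdeep_audit(); the names working_directory, build_audit_command and audit_package are hypothetical, and it assumes the manifest was written with relative paths from the package root and that hashdeep exits with status 0 when the audit passes.

```python
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily chdir into `path`, restoring the previous cwd afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def build_audit_command(manifest_path):
    """Build a hashdeep audit command (hypothetical helper).

    -a: audit mode, -k: file of known hashes, -r: recurse,
    -l: relative paths, -vv: verbose reporting.
    """
    return ["hashdeep", "-a", "-vv", "-r", "-l", "-k", manifest_path, "."]


def audit_package(package_dir, manifest_path):
    """Run the audit from inside package_dir; True if hashdeep exits 0.

    Treating exit status 0 as "audit passed" is an assumption to confirm
    against the hashdeep version in use.
    """
    # Resolve the manifest before chdir so a relative path still works.
    manifest_path = os.path.abspath(manifest_path)
    with working_directory(package_dir):
        result = subprocess.run(
            build_audit_command(manifest_path),
            capture_output=True,
            text=True,
        )
    return result.returncode == 0
```

Wrapping os.chdir in a context manager means the script ends up back in its original directory even if hashdeep fails or the tape mount disappears partway through.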

kieranjol commented 6 years ago

I'm curious to know how that goes for you. I think I was having multithreading issues: md5deep would try to hash about 4 files at a time, so the tape would just keep spinning up and down forever and wouldn't really do anything. I don't think specifying single-threading helped much either, as I think hashdeep might have been reading the files out of alphabetical order, which is usually fine on a spinning disk but not on a tape (assuming the files were written in alphabetical order). These issues are a lot worse with image sequences, though they can be a pain with large files too.
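To illustrate the two problems described here (parallel reads and out-of-order reads), this is a minimal sketch that sidesteps hashdeep entirely and uses Python's hashlib instead: it hashes one file at a time, in sorted order, and compares against a manifest. The function names and the manifest shape (a dict of relative path to MD5 hex digest) are assumptions for illustration, and sorted order only helps if it matches the order in which the files were actually written to tape.

```python
import hashlib
import os


def iter_files_sorted(root):
    """Yield file paths under `root` in a deterministic, sorted order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # in-place sort controls the order os.walk descends
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a single file in chunks so large files are not loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sequentially(root, expected):
    """Compare files under `root` with `expected` (relative path -> MD5 hex).

    Files are read strictly one after another, never in parallel.
    Returns a sorted list of relative paths that are mismatched, not in
    the manifest, or listed in the manifest but missing on disk.
    """
    problems = set()
    seen = set()
    for path in iter_files_sorted(root):
        rel = os.path.relpath(path, root)
        seen.add(rel)
        if expected.get(rel) != md5_of_file(path):
            problems.add(rel)
    problems.update(set(expected) - seen)
    return sorted(problems)
```

An empty list from verify_sequentially() means every file matched. For image sequences, where there are thousands of small files, keeping the reads in a single ordered pass is the point of the exercise.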



mcampos-quinn commented 6 years ago

I'm not going to hold my breath... but man it would be convenient if it just worked!

Thanks for sharing those roadblocks you found. I will have a better sense of where to start troubleshooting. Definitely will let you know what I find out, probably late this week.

kieranjol commented 6 years ago

Cool, validating checksums on tape can work great. I think Dave doesn't do it, though: he pulls the files off, then validates those.

mcampos-quinn commented 6 years ago

In the fragile testing environment I set up this is looking good with some caveats:

Definitely experienced the spin/wait/hang/spin loop mentioned above. Also testing hashdeep directly helped to suss out these potential solutions.

mcampos-quinn commented 6 years ago

Seems to be just fine. Removed the actual writing of the audit file altogether from the initial test. Maybe it will be added back in later?