digital-preservation / droid

DROID (Digital Record and Object Identification)
BSD 3-Clause "New" or "Revised" License
Can you guys make it so that it renames the files it scans? #182

Closed by jpcreeper13 7 years ago

Dclipsham commented 7 years ago

Hi there. On the surface, this sounds like undesirable behaviour for a tool primarily used for digital preservation. Can you please describe the use case? David

jpcreeper13 commented 7 years ago

I'm using it to identify "extracted.x" files from a WARC file, and I want the program to be able to change the names of the extracted files based on the identification.

jpcreeper13 commented 7 years ago

I never got a reply

paulyoung84 commented 7 years ago

Hi, I am sorry for the late reply. Unfortunately this is not something which would be in scope for DROID. As a digital preservation tool it is designed to scan the files without causing any changes to them.

jpcreeper13 commented 7 years ago

I’m trying to get my files back

anjackson commented 7 years ago

@jpcreeper13 Are you wanting to get lots of files back, or is it a small enough number that you could do them individually?

If the latter, you could use WebRecorder Player or, if you are on macOS, The Unarchiver, which supports WARCs.

EDIT: in case it helps: http://qanda.digipres.org/610/how-to-open-warc-files

EDIT AGAIN: Sorry, I assumed you were dealing with WARCs (because I thought I recognised the extracted.x thing) and I think I was mistaken. I'll ask around.

jpcreeper13 commented 7 years ago

It’s a 56 GB archive with around 55,000 files

anjackson commented 7 years ago

What format is the archive? Is it WARCs? Or was that just a wild guess too far?

jpcreeper13 commented 7 years ago

It is a WARC. I’m looking for video files

anjackson commented 7 years ago

The best option I know of is warcat, but only if you are happy running tasks from the command line. Once installed, you should be able to use it like this:

$ python3 -m warcat extract megawarc.warc.gz --output-dir ./extracted/megawarc/ --progress

If you need a GUI then The Unarchiver is the only one I know of, and that's macOS-only.
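
On the original request (naming the extracted files by their identified type): since renaming is out of scope for DROID itself, one option is to do it in a separate step driven by DROID's results. The following is only a rough sketch, not anything DROID provides. It assumes you have exported the identification results to CSV and that the export has FILE_PATH and MIME_TYPE columns; check those names against your own export. The function names and the small MIME-to-extension table are made up for illustration. It copies files rather than renaming them, so the originals stay untouched.

```python
import csv
import mimetypes
import shutil
from pathlib import Path

# Illustrative mapping only; extend it for the formats you care about.
EXT_BY_MIME = {
    "video/mp4": ".mp4",
    "video/webm": ".webm",
    "video/x-flv": ".flv",
}


def plan_copies(csv_path, dest_dir, mime_prefix=""):
    """Yield (source, target) paths for identified rows of a DROID-style CSV."""
    dest = Path(dest_dir)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Column names are assumptions; check them against your export.
            src = (row.get("FILE_PATH") or "").strip()
            # Take the first MIME type if several are listed in one cell.
            mime = (row.get("MIME_TYPE") or "").split(",")[0].strip()
            if not src or not mime or not mime.startswith(mime_prefix):
                continue  # skip folders, unidentified files, unwanted types
            ext = EXT_BY_MIME.get(mime) or mimetypes.guess_extension(mime)
            if ext:
                yield Path(src), dest / (Path(src).name + ext)


def copy_identified(csv_path, dest_dir, mime_prefix=""):
    """Copy matching files to dest_dir with an extension appended.

    The originals are left untouched. Returns the list of files written.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for src, target in plan_copies(csv_path, dest_dir, mime_prefix):
        # Skip missing sources and never overwrite an existing copy.
        if src.is_file() and not target.exists():
            shutil.copy2(src, target)
            written.append(target)
    return written
```

For the video files only, something like copy_identified("droid-export.csv", "./renamed/", mime_prefix="video/") should do it, where both paths are placeholders for your own.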