brenthuisman / par2deep

Produce, verify and repair par2 files recursively.
GNU Lesser General Public License v3.0

Separate folder for par files #4

Closed: th closed this issue 3 years ago

th commented 5 years ago

A feature wish:

I would like a new option specifying a separate folder, such that par2deep stores all par files in it.

The current behaviour clobbers the original folders, which is not practical in some usages.

In this special "target folder", par2deep could create a folder structure mimicking the original folder structure, so that a 1:1 relationship between original file and par files is maintained, but the user doesn't have to see the par files in his daily life.
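The mirrored layout requested here could be sketched as a pure path mapping. This is only an illustration: `parity_path`, `data_root` and `parity_root` are hypothetical names, not part of par2deep, and the example paths are made up.

```python
from pathlib import PurePosixPath


def parity_path(source, data_root, parity_root):
    """Map a data file to the .par2 index file in a mirrored parity tree.

    Hypothetical helper: not an existing par2deep function.
    """
    # relative_to() raises ValueError if source is not below data_root.
    relative = PurePosixPath(source).relative_to(data_root)
    # Keep the sub-directory structure and append ".par2" to the file name.
    return PurePosixPath(parity_root) / relative.parent / (relative.name + ".par2")


example = parity_path(
    "/my/photos/vacation2019/photo009.jpg", "/my/photos", "/my/photos/par"
)
print(example)
```

Because the mapping is one-to-one, the data file can be recovered from the parity path by reversing the same steps, so no separate index would be needed as long as both trees keep the same shape.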

Thanks for releasing par2deep!

brenthuisman commented 5 years ago

Happy to hear you like the tool!

The primary reason par2deep currently keeps the parity files next to the original files is that it makes changing the file hierarchy easy. With a separate folder, you'd need to either keep the same hierarchy and thus change it twice, or use some sort of index, which needs to be computed on each init, which is probably slow for typical par2deep use-cases.
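To make the "change it twice" cost concrete, here is a minimal sketch of the bookkeeping a mirrored tree would need: a scan that lists parity files whose data file has moved or disappeared. `orphaned_parity` is a hypothetical helper, and the assumed naming scheme (`<file>.par2` and `<file>.volNNN+NNN.par2`) follows the file names shown elsewhere in this thread.

```python
import re
import tempfile
from pathlib import Path

# Strips ".par2" and an optional ".volNNN+NNN" part from a parity file name.
PAR2_SUFFIX = re.compile(r"(\.vol\d+\+\d+)?\.par2$")


def orphaned_parity(data_root, parity_root):
    """Return parity files whose data file is missing from data_root."""
    orphans = []
    for par in sorted(Path(parity_root).rglob("*.par2")):
        relative = par.relative_to(parity_root)
        source_name = PAR2_SUFFIX.sub("", relative.name)
        if not (Path(data_root) / relative.parent / source_name).exists():
            orphans.append(par)
    return orphans


# Small self-contained demo on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data, parity = Path(tmp, "data"), Path(tmp, "parity")
    (data / "a").mkdir(parents=True)
    (data / "a" / "x.jpg").touch()
    (parity / "a").mkdir(parents=True)
    (parity / "b").mkdir()
    (parity / "a" / "x.jpg.par2").touch()
    (parity / "a" / "x.jpg.vol000+100.par2").touch()
    (parity / "b" / "y.jpg.par2").touch()  # its data file does not exist
    demo = [p.relative_to(parity).as_posix() for p in orphaned_parity(data, parity)]
print(demo)
```

A full walk like this on every run is the kind of extra work that keeping parity next to the data avoids.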

However, I see (my version of) par2 has the basepath option, which seems to be what you're looking for. Could you verify if that does what you want? That's the quickest fix if it works, otherwise I'd have to dig in and move files around manually, do more bookkeeping, which I probably won't have time for anytime soon ;)

brenthuisman commented 4 years ago

-B flag on Ubuntu 18.04's par2.

seanmikhaels commented 4 years ago

Would like to see this integrated into par2deep as well. I have a script I created to find all files in a directory, create par2 files, create a recovery directory named after the file ("filename-recovery"), and move the created par2 files into that directory. There are some bugs with long file names and white space, but I would love for this to be added to par2deep in the future.

alexdi2 commented 4 years ago

Can I second this request? This utility is so close to solving file integrity for those of us without ZFS etc., but the requirement to store the PAR files in the same directory is absolutely killer. Creating PARs with par2deep and moving them to a mirrored directory tree with a different utility works if the archive is static, but defeats the powerful 'check and repair' functionality you've built into this. Temporarily putting them back before running the repair causes other problems (like befuddling backup programs that monitor file changes). I was not able to determine the correct syntax to employ 'basepath,' if indeed this is applicable-- there's a note in April suggesting the program no longer uses par2.exe.

brenthuisman commented 4 years ago

Do any of you know how to do this with par2, directly on the cmdline? I'm unable to get creation of parity files to work using -B, only verification.

seanmikhaels commented 4 years ago

May need to fork the original hashdeep code and implement it that way. par2 -B won't output the par files to a separate folder afaik. Here's the bash script I created to find all files, create par2 files with default redundancy, and place them into a separate folder with -recovery appended to its name.

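#!/bin/bash
# For every file: create par2 files, collect them in "<file>-recovery/",
# then move that directory into parfiles/.
find . -type f -print0 | while IFS= read -r -d '' file; do
    mkdir "$file"-recovery/
    par2create -n1 "$file".par2 "$file"
    mv "$(dirname "$file")"/*.par2 "$file"-recovery/
    mv "$file"-recovery/ parfiles/
done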

brenthuisman commented 4 years ago

Hmm, if par2 does not support this, it will be a bit more involved than I hoped. I initially thought -B was all I had to implement (and I have). Forking upstream is a bit too much for me to take on. What I could do (but have failed at so far) is understand the par2 code well enough that I can feed it bytes rather than filenames, but because of its spaghetti nature I failed on my previous attempt (and chose to simply use the cmdline interface for libpar2, which also makes it easy to stay compatible for people who can't or won't use libpar2).

Moving files around outside of libpar2/par2 is pointless: that still requires write access to the filesystem holding your data (e.g. you can't make parity files for a DVD and place them somewhere else). Keeping the data filetree untouched would be the point of adding this option.

Pity, I would like to have this myself as well 🙂 As it stands, I'll not be focusing on this. Maybe when I'm bored I'll pore over par2's guts again. Pull requests are of course very welcome!

mbollmann commented 4 years ago

Isn't this here doing what you want? The protected file lives in ./orig while the par files live in ./par:

/tmp/par-test ↯ ll *
orig:
total 604K
-rw-------. 1 bollmann bollmann 602K Aug 26 13:15 test.zip

par:
total 0
/tmp/par-test ↯ par2 c -apar/test -B$(pwd) -r5 -n1 -- orig/test.zip 

Block size: 308
Source file count: 1
Source block count: 2000
Redundancy: 5%
Recovery block count: 100
Recovery file count: 1

Opening: orig/test.zip
Computing Reed Solomon matrix.
Constructing: done.
Wrote 30800 bytes to disk
Writing recovery packets
Writing verification packets
Done
/tmp/par-test ↯ ll *
orig:
total 604K
-rw-------. 1 bollmann bollmann 602K Aug 26 13:15 test.zip

par:
total 356K
-rw-------. 1 bollmann bollmann  40K Aug 26 13:16 test.par2
-rw-------. 1 bollmann bollmann 313K Aug 26 13:16 test.vol000+100.par2

brenthuisman commented 4 years ago

Your incantations are new to me! I did not know that I also had to use the -a flag, and I was using -B like a buffoon as follows: -B. for current dir. That doesn't work, but your script does! I think you just solved the issue, but I think it created a new one.
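For reference, the working invocation from mbollmann's transcript can be assembled programmatically. This is a sketch only: `par2_create_argv` is a hypothetical helper, while the flags (-a, -B, -r, -n and the `--` separator) are the ones used in the transcript above.

```python
import shlex


def par2_create_argv(source, par_base, basepath, redundancy=5, volumes=1):
    """Build the argument list for a 'par2 c' call with -a and -B combined.

    par_base is the parity file name without ".par2"; in the transcript
    par2 appended the extension itself.
    """
    return [
        "par2", "c",
        f"-a{par_base}",     # where the parity files are written
        f"-B{basepath}",     # base path the data file names are relative to
        f"-r{redundancy}",   # redundancy in percent
        f"-n{volumes}",      # number of recovery files
        "--", source,
    ]


argv = par2_create_argv("orig/test.zip", "par/test", "/tmp/par-test")
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` from inside /tmp/par-test would then correspond to the command typed in the transcript, with `$(pwd)` already expanded.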

FMI: To verify/repair:

This presents another problem however, par2 finds the originals (relative to the base path) by virtue of the .par2 file hardcoding the relative path (relative to -B). This effectively removes the ability to move files around, which is Not Good (tm).

The only solution is then (apart from moving things to a tmp location which will make performance worse than it already is) to dig into libpar2 and make it eat data streams rather than filenames.

mbollmann commented 4 years ago

Your incantations are new to me! I did not know that I also had to use the -a flag, and I was using -B like a buffoon as follows: -B. for current dir.

Don't worry, I stumbled on all of the same things and was about to give up, until for some reason after a good night's sleep my mind told me to try -a and -B together :)

This presents another problem however, par2 finds the originals (relative to the base path) by virtue of the .par2 file hardcoding the relative path (relative to -B). This effectively removes the ability to move files around, which is Not Good (tm).

If the goal is to not have the relative path in the par2 file, you could just change the working directory before invoking par2, I think. Then par/foo/bar/test.par2 would just refer to test and the burden of knowing that it protects foo/bar/test would fall on par2deep, i.e. you could call verify/repair with -B$(pwd)/foo/bar.
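A rough sketch of that idea, where the caller rather than the .par2 file knows which directory a parity file protects. `par2_check_argv` and the example paths are hypothetical, and, like the suggestion itself, the approach is untested here; `v` and `r` are par2's verify and repair commands.

```python
from pathlib import PurePosixPath


def par2_check_argv(par_file, parity_root, data_root, repair=False):
    """Build a verify (or repair) call for a parity file in a mirrored tree.

    The base path is derived from the parity file's position below
    parity_root, so the .par2 file only has to store the bare file name.
    """
    rel_dir = PurePosixPath(par_file).relative_to(parity_root).parent
    basepath = PurePosixPath(data_root) / rel_dir
    command = "r" if repair else "v"
    return ["par2", command, f"-B{basepath}", str(par_file)]


check = par2_check_argv("/data/par/foo/bar/test.par2", "/data/par", "/data")
print(check)
```

If a folder is moved in the data tree, only the matching folder in the parity tree has to move with it; the contents of the .par2 files stay valid under this scheme.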

brenthuisman commented 4 years ago

I haven't tested it, but it looks like -B is prepended to both the path supplied in -a and any filenames. In which case changing cwd won't help, e.g. if you want this:

/my/photos/vacation2019/photo009.jpg
/my/photos/par/vacation2019/photo009.jpg.par2

-B must be /my/photos/, right? I don't see what the right way of calling par2 would be here.

It's very helpful to have someone else help with this btw, together we'll stumble over the right incantation at least twice as fast as me alone ;)

mbollmann commented 4 years ago

I just tested it, the path after -a can be absolute. So you should be able to do

cd /my/photos/vacation2019/
par2 c -a/my/photos/par/vacation2019/photo009.jpg -B/my/photos/vacation2019 -- photo009.jpg

brenthuisman commented 4 years ago

OK! basepath is then prepended for just the files, not at all for the parity files. OK, now it all makes some kind of sense. OK, when I find some free time I'll have a go at it.
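Put together, a per-file call in the style of mbollmann's example might be planned as below. This is a sketch under assumptions: `plan_create` is a hypothetical name, and the target parity directory is assumed to exist already, since the thread does not say whether par2 creates it.

```python
from pathlib import PurePosixPath


def plan_create(source, data_root, parity_root):
    """Return (argv, cwd) for creating parity for one file.

    -a gets an absolute path inside the mirrored parity tree, while -B and
    the working directory point at the data file's own folder, so only the
    bare file name is passed to par2.
    """
    source = PurePosixPath(source)
    relative = source.relative_to(data_root)
    par_base = PurePosixPath(parity_root) / relative
    argv = ["par2", "c", f"-a{par_base}", f"-B{source.parent}", "--", source.name]
    return argv, str(source.parent)


argv, cwd = plan_create(
    "/my/photos/vacation2019/photo009.jpg", "/my/photos", "/my/photos/par"
)
print(cwd, argv)
```

Running it would be `subprocess.run(argv, cwd=cwd, check=True)`, which takes the place of the `cd` in the shell version.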

brenthuisman commented 3 years ago

I've discovered it can be simplified a bit:

par2 c -s100000 -c100 -B./my/photos/vacation2019 ./parity/my/photos/vacation2019/photo009.jpg.par2 ./my/photos/vacation2019/photo009.jpg

Using gopar, the equivalent is:

par c -s 100000 -c 100 ./parity/my/photos/vacation2019/photo009.jpg.par2 ./my/photos/vacation2019/photo009.jpg
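The two simplified command lines differ only in how options are spelled, which a small builder can capture. `simplified_create_argv` is a hypothetical helper; the parameter names reflect my reading of par2's -s (block size) and -c (recovery block count) options and are assumed to mean the same for gopar.

```python
import posixpath


def simplified_create_argv(source, par_file, block_size=100000,
                           recovery_blocks=100, tool="par2"):
    """Build the simplified create call for par2 or gopar."""
    if tool == "par2":
        # par2 takes attached option values and needs -B set to the data folder.
        basepath = posixpath.dirname(source)
        return ["par2", "c", f"-s{block_size}", f"-c{recovery_blocks}",
                f"-B{basepath}", par_file, source]
    if tool == "gopar":
        # gopar's binary is called "par" and takes space-separated values.
        return ["par", "c", "-s", str(block_size), "-c", str(recovery_blocks),
                par_file, source]
    raise ValueError(f"unknown tool: {tool}")


src = "./my/photos/vacation2019/photo009.jpg"
par = "./parity/my/photos/vacation2019/photo009.jpg.par2"
print(simplified_create_argv(src, par))
print(simplified_create_argv(src, par, tool="gopar"))
```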

brenthuisman commented 3 years ago

@mbollmann @alexdi2 @seanmikhaels @th : could you test this function in the new par2deep release? It generally works, but occasionally par2 messes with the path anyway (it will detect a move and propose a move, or will create new parity in the wrong directory). I think the .1 release solves this, but extra confirmation wouldn't hurt ;)

alexdi2 commented 3 years ago

Brent, I'm not getting very far in testing. With the latest release and a single test folder, I'm receiving 'createdfiles_err' for parity regardless of whether the 'create in parity folder' option is checked.

brenthuisman commented 3 years ago

Hmm, that's not good. Do you use an external par2?

alexdi2 commented 3 years ago

I do not; I just removed and updated par2deep with pip.
