Closed Afforess closed 6 years ago
@Afforess Thanks for tackling this :+1:
I think the preliminary features are done. There may be some code cleanup/tightening needed, so feel free to call out anything spotted. I didn't try to make the `--skip-if-exists` flag use the file size / checksum / anything sane; it just tests that every file in the archive already exists. This is not ideal, but the disparity in the output between unrar / 7z / rar may preclude any intensive checks without a rewrite. If checksum integrity is needed/desired, I think the files would have to be extracted each time and compared. I'm not inclined to favour this approach, as my own personal needs have this running as part of a cron job, and re-extracting all the files in a very large media directory to compare checksums sounds like a terrible way to use my CPU cycles. Alternatively, it may be possible to scrape the file size out of the rar archive without extraction using `unrar l`, but I am not sure how valuable that is.
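As a rough sketch of the "every file in the archive already exists" check described above (this is a hypothetical helper, not unrarall's actual code; in practice the file list would come from something like `unrar lb archive.rar`, which prints bare file names one per line):

```shell
# Hypothetical helper: read a newline-separated file list on stdin and
# succeed only if every listed file already exists on disk.
all_files_exist() {
  while IFS= read -r f; do
    [ -e "$f" ] || return 1
  done
  return 0
}

# Illustration with a literal list; in unrarall the list would be the
# archive's contents, e.g. from `unrar lb archive.rar`.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b"
printf '%s\n' "$tmp/a" "$tmp/b" | all_files_exist && echo "skip extraction"
printf '%s\n' "$tmp/a" "$tmp/missing" | all_files_exist || echo "extract"
rm -rf "$tmp"
```

Note this only tests existence, which is exactly the limitation discussed: nothing checks sizes or checksums.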
On the plus side, unrarall now uses temporary directories inside the output/cwd directory to store the extracted intermediate results, which means that interrupted/broken archives should not contaminate the file listing and falsely trick `--skip-if-exists`. On the down side, the safe-move feature means that `--skip-if-exists` will not work if unrarall is trying to extract from multiple rar files and has to choose alternate names. Nothing short of iterating all the files and comparing checksums would work here.
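The temp-directory flow might look roughly like this (directory names are illustrative and the real extraction step is omitted):

```shell
out=$(mktemp -d)                          # stands in for the real output/cwd directory
work=$(mktemp -d "$out/.unrarall.XXXXXX") # intermediate extraction target

# unrar x archive.rar "$work/"            # real extraction would happen here
touch "$work/data"                        # stand-in for an extracted file

# Files are only promoted into $out after a complete extraction, so an
# interrupted run leaves nothing in $out to fool --skip-if-exists.
mv "$work"/* "$out"/
rmdir "$work"
ls "$out"                                 # -> data
```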
Any other comments? I'll squash my commits if you prefer this all-in-one.
~I've been using this a bit and I did notice that `--skip-if-exists` does not always play nicely with rars split into parts. I'll try and see if this can be fixed.~
I added a test and it seems to be fine.
I've been using this for a few days, and it seems to work well for the most part. However, I noticed some negative interaction between two new features, `--skip-if-exists` and `--depth` (recursive extraction of rars). `--skip-if-exists` doesn't correctly detect extracted inner rar archives because of the design. Here's an example workflow where it has issues:

1. `foo/` contains `bar.rar`. `bar.rar` contains `quz.rar` inside, which further contains a file `data` (e.g. `foo/bar.rar/quz.rar/data`).
2. `unrarall --skip-if-exists foo/` is executed (note: `--depth 4` is implicitly set).
3. unrarall extracts `bar.rar` to `foo/<tmp dir>` and then executes a child process `unrarall --skip-if-exists --depth 3 foo/<tmp dir>` to extract any inner rar archives.
4. The child process extracts `quz.rar` inside `foo/<tmp dir>/<tmp dir>`, detects no further inner rar archives, so `data` is moved back to the original directory location, `foo/<tmp dir>`, and the child process exits.
5. unrarall moves `data` back to the original directory location, `foo/`, and exits.

In step 4, unrarall was run from a child process invoked from the original script, so it was unaware that `foo/`, and not `foo/<tmp dir>`, was the location it needed to search for files that already existed and need not be extracted. As a result it re-extracted `data` again. The failure here is graceful: unrarall didn't lose data and didn't delete anything it shouldn't have; instead it wastes CPU cycles and doesn't properly skip extraction.
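The parent/child recursion in the steps above can be sketched as a depth-decrementing loop; each invocation only knows the directory it was handed, which is exactly why the inner child cannot consult `foo/` (the paths and function name here are illustrative, not unrarall's real code):

```shell
# Each "pass" models one unrarall invocation: the child is handed only
# its own tmp dir and the decremented depth, never the top-level foo/.
recurse() {  # $1 = directory this invocation can see, $2 = remaining depth
  echo "pass: dir=$1 depth=$2"
  [ "$2" -le 1 ] && return 0
  # (real unrarall would extract any archives found in $1 into a tmp dir here)
  recurse "$1/<tmp dir>" $(( $2 - 1 ))
}
recurse foo 4
```

This prints four passes, from `dir=foo depth=4` down to `dir=foo/<tmp dir>/<tmp dir>/<tmp dir> depth=1`; the `--skip-if-exists` check in each pass consults only its own `dir` argument.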
My initial thought is there are three options for this remaining issue. One is to do nothing and accept the limitation; another is to change `--skip-if-exists` so the child process created in step 3 can check `foo/` instead of `foo/<tmp dir>`.
Thoughts @delcypher, @arfoll, others?
If "Do nothing" is acceptable, this PR can be merged now, as-is. Otherwise changes need to be made to fix the last outstanding issue.
Yawn, well a whole year has practically gone by, and I've been using this for a while. I've not seen any major issues - has anyone else tried out the changes in this PR?
@Afforess sorry - I understand the frustration; I've been busy with other stuff. I've been using this as well (although not really the new features, since they're not really needed for my use cases...) for a while now and not found any issues, so let me go through it again to jog my memory and I'll happily merge it.
Merged - thanks a lot and apologies again for the delay.
Glad this got merged. I hope my comments weren't overly annoying, it probably could have been toned down.
No worries, bash scripts are frustrating to maintain & patch :)
This PR is intended to implement the features outlined in #37, and when complete should obsolete my previous PR, #30. ~I'm leaving #30 open for now as it is a working implementation, while this is presently a work-in-progress.~ *Not a WIP anymore.*
TODO:
The files not being clobbered doesn't happen exactly the way outlined in #37; it uses the `mv -b` featureset, which, when the file `foo` exists and `mv -b bar foo` is called, will move `bar` to location `foo~` instead. The rename suffix of the backup file can be overridden by environment variables; see `man mv` for details. This approach seems significantly more maintainable than trying to generate a unique suffix and move the file to that location ourselves, when `mv` can handle it for us. The downside is it's not possible to issue a warning for the renamed file because `mv` doesn't tell us about it. `mv -v` verbose output shows the backup rename, but parsing it seems like a significant burden for such a tiny gain. Thoughts @delcypher, others?

I'll hopefully tackle the remaining major features in the next few days.
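The `mv -b` behaviour discussed above is easy to demonstrate with GNU coreutils `mv` (the `.orig` suffix below is just an example of the environment-variable override mentioned in `man mv`):

```shell
cd "$(mktemp -d)"
echo old > foo
echo new > bar
mv -b bar foo     # foo already exists, so the old foo is renamed to foo~
# foo now contains "new", foo~ contains "old" -- and mv said nothing about it

echo newer > baz
SIMPLE_BACKUP_SUFFIX=.orig mv -b baz foo   # override the backup suffix
# foo now contains "newer", foo.orig contains "new"
```

Note that `-b` is a GNU extension, so this relies on GNU `mv` being the one on `PATH`.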