This was discussed earlier in #248 and mostly resolved, but I came across a slight inconsistency which I would like to resolve before implementing a similar interface for --no-reflinks etc. (#328).
Firstly, the question was asked regarding the use case of --no-hardlinked. I didn't respond at the time, but I see the use case as basically de-cluttering the rmlint output. Consider the test case:
$ mkdir dir && echo data > dir/file && echo data > dir/same
$ for i in {1..4}; do ln dir/file dir/link$i; done
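As a sanity check of this setup, here is a minimal sketch, independent of rmlint, that rebuilds the test case in a scratch directory and confirms that the links share an inode with `file` while `same` is a separate copy. The names `scratch` and `nlink` are illustrative only, and the `-ef` test is assumed to be available (it is in bash and dash).

```shell
#!/bin/sh
# Sanity check of the test case; this does not call rmlint.
set -eu

scratch=$(mktemp -d)            # throwaway directory (illustrative name)
mkdir "$scratch/dir"
echo data > "$scratch/dir/file"
echo data > "$scratch/dir/same"
for i in 1 2 3 4; do
    ln "$scratch/dir/file" "$scratch/dir/link$i"
done

# The link count is the second field of `ls -l` output.
nlink() { ls -ld "$1" | awk '{print $2}'; }

file_links=$(nlink "$scratch/dir/file")   # file plus its 4 hardlinks
same_links=$(nlink "$scratch/dir/same")   # independent copy
echo "file: $file_links links, same: $same_links link(s)"

# -ef is true when two paths refer to the same device and inode.
if [ "$scratch/dir/link1" -ef "$scratch/dir/file" ]; then
    echo "link1 is a hardlink of file"
fi
if ! [ "$scratch/dir/same" -ef "$scratch/dir/file" ]; then
    echo "same is a separate copy"
fi
```
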
The default output of rmlint is:
$ rmlint dir
# Duplicate(s):
ls '<pwd>/dir/file'
rm '<pwd>/dir/link1'
rm '<pwd>/dir/link2'
rm '<pwd>/dir/link3'
rm '<pwd>/dir/link4'
rm '<pwd>/dir/same'
Since hardlinks don't take up extra space (other than their directory entries), we have --keep-hardlinked:
$ rmlint dir --keep-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
ls '<pwd>/dir/link1'
ls '<pwd>/dir/link2'
ls '<pwd>/dir/link3'
ls '<pwd>/dir/link4'
rm '<pwd>/dir/same'
But that's a lot of output, so we have --no-hardlinked:
$ rmlint dir --no-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
rm '<pwd>/dir/same'
The file deletions are the same as with --keep-hardlinked, but the output is more concise.
So far so good.
But...
$ # add some hardlinks of "same":
$ for i in {1..2}; do ln dir/same dir/same_link$i; done
Default behaviour is as expected. --keep-hardlinked also looks OK to me; it preserves any hardlinks of the original and deletes everything else:
$ rmlint dir --keep-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
ls '<pwd>/dir/link1'
ls '<pwd>/dir/link2'
ls '<pwd>/dir/link3'
ls '<pwd>/dir/link4'
rm '<pwd>/dir/same'
rm '<pwd>/dir/same_link1'
rm '<pwd>/dir/same_link2'
But --no-hardlinked gives this:
$ rmlint dir --no-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
rm '<pwd>/dir/same'
And if we run the shell script to delete the dupes, we are left with:
$ ls dir
file link1 link2 link3 link4 same_link1 same_link2
$ rmlint dir --no-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
rm '<pwd>/dir/same_link1'
It would take 3 successive runs to actually free up any space.
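The reason is that removing one name of a hardlinked file only lowers its link count; the data is released when the last name is removed. The following sketch, independent of rmlint, mimics the three passes on a scratch copy of `same` and its two links. The names `scratch`, `nlink`, `after_first` and `after_second` are illustrative only.

```shell
#!/bin/sh
# Why three passes are needed: each pass removes only one name.
set -eu

scratch=$(mktemp -d)            # throwaway directory (illustrative name)
echo data > "$scratch/same"
ln "$scratch/same" "$scratch/same_link1"
ln "$scratch/same" "$scratch/same_link2"

# The link count is the second field of `ls -l` output.
nlink() { ls -ld "$1" | awk '{print $2}'; }

# Pass 1: remove "same"; two names still reference the data.
rm "$scratch/same"
after_first=$(nlink "$scratch/same_link1")
echo "after pass 1: $after_first names still hold the data"

# Pass 2: remove "same_link1"; one name is left.
rm "$scratch/same_link1"
after_second=$(nlink "$scratch/same_link2")
echo "after pass 2: $after_second name still holds the data"

# Pass 3: removing the last name finally releases the data.
rm "$scratch/same_link2"
echo "after pass 3: no names left, space can be freed"
```
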
So I'm going to suggest that the desired behaviour for --no-hardlinked in this case should be:
$ rmlint dir --no-hardlinked
# Duplicate(s):
ls '<pwd>/dir/file'
rm '<pwd>/dir/same'
rm '<pwd>/dir/same_link1'
rm '<pwd>/dir/same_link2'
So essentially --no-hardlinked behaves exactly like --keep-hardlinked, except that it doesn't print the hardlinks that are being kept.
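To make the proposal concrete, here is a rough model of the suggested output rule. It is a sketch and not rmlint code: the function `plan_no_hardlinked` is hypothetical, the members of the duplicate group are passed in by hand instead of being detected, and the first argument is assumed to be the chosen original.

```shell
#!/bin/sh
# Hypothetical model of the proposed --no-hardlinked output; not rmlint code.
set -eu

# $1 is the original; the remaining arguments are the other group members.
plan_no_hardlinked() {
    original=$1
    shift
    printf "ls '%s'\n" "$original"
    for path in "$@"; do
        # -ef is true when both paths name the same device and inode,
        # i.e. the path is a hardlink of the original: keep it silently.
        if [ "$path" -ef "$original" ]; then
            continue
        fi
        # Everything else is removed, including hardlinks of other copies.
        printf "rm '%s'\n" "$path"
    done
}

# Rebuild the test case from this issue in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/dir"
echo data > "$demo/dir/file"
echo data > "$demo/dir/same"
for i in 1 2 3 4; do ln "$demo/dir/file" "$demo/dir/link$i"; done
for i in 1 2; do ln "$demo/dir/same" "$demo/dir/same_link$i"; done

cd "$demo"
plan=$(plan_no_hardlinked dir/file dir/link1 dir/link2 dir/link3 dir/link4 \
                          dir/same dir/same_link1 dir/same_link2)
printf '%s\n' "$plan"
```

Under these assumptions the sketch prints one `ls` line for `dir/file` and `rm` lines for `dir/same`, `dir/same_link1` and `dir/same_link2`, so a single pass removes every name that holds the duplicate data.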
I'll keep this issue open for a couple of weeks for comment / input. If I hear nothing, I'll go ahead as described above.