sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

Combining --progress and --replay deletes hashes in rmlint.json #581

Open mb720 opened 1 year ago

mb720 commented 1 year ago

Hi and thanks for rmlint!

I was surprised by the behavior of this command:

rmlint --progress -c sh:link --replay rmlint.json .

I expected rmlint to reuse the existing rmlint.json and create an rmlint.replay.sh that turns duplicate files into hardlinks, reporting a bit of progress while doing so.

Instead, both rmlint.json and rmlint.sh got overwritten with nearly empty files. The hashes are gone from rmlint.json.

I wasn't sure what was wrong and didn't expect the innocuous-looking --progress option to be causing the trouble, but it was. Removing --progress made rmlint behave as I wanted:

rmlint -c sh:link --replay rmlint.json .

I suspect this behavior is a bug, since rmlint overwrites data without warning and this doesn't seem to be documented in the man page.

I'm using rmlint 2.10.1 on Arch Linux. According to rmlint --version, the compile features are:

compiled with: +mounts +nonstripped +fiemap +sha512 +bigfiles +intl +replay +xattr +btrfs-support

cebtenzzre commented 1 year ago

This isn't explicitly documented, but it is subtly implied by the way --progress is described in the manpage:

Convenience shortcut for -o progressbar -o summary -o sh:rmlint.sh -o json:rmlint.json -VV.

Note the explicit -o sh:rmlint.sh -o json:rmlint.json.

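Any explicit -o clears the default outputs, which are different in replay mode, so with --progress the results are written to rmlint.sh and rmlint.json instead. That said, --progress still allows you to override the sh/json outputs individually.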
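
To make the interaction concrete, here is a toy model in Python of how the outputs might be resolved. It is only a sketch and not rmlint's actual code: the function name `resolve_outputs` is hypothetical, the normal-mode defaults are simplified, and the replay-mode default names (rmlint.replay.sh, rmlint.replay.json) are an assumption based on the file name mentioned in the original report. Overriding the sh/json outputs individually is not modeled.

```python
# Toy model of the output resolution described above; NOT rmlint's real code.

# Outputs that --progress expands to, per the manpage text quoted above
# (the -VV part of the shortcut is irrelevant here and left out).
PROGRESS_EXPANSION = ["progressbar", "summary", "sh:rmlint.sh", "json:rmlint.json"]

# ASSUMPTION: simplified defaults. The replay names are inferred from the
# report, which expected an rmlint.replay.sh to be created.
DEFAULT_OUTPUTS = {
    False: ["sh:rmlint.sh", "json:rmlint.json"],
    True: ["sh:rmlint.replay.sh", "json:rmlint.replay.json"],
}


def resolve_outputs(explicit_outputs, progress=False, replay=False):
    """Return the list of outputs a run would write to (hypothetical helper)."""
    outputs = list(explicit_outputs)
    if progress:
        # --progress behaves like a handful of explicit -o options.
        outputs = PROGRESS_EXPANSION + outputs
    if outputs:
        # Any explicit -o clears the defaults, including the replay defaults.
        return outputs
    return list(DEFAULT_OUTPUTS[replay])


if __name__ == "__main__":
    print(resolve_outputs([], replay=True))
    print(resolve_outputs([], progress=True, replay=True))
```

Under this model, a replay run without --progress keeps the replay defaults, while adding --progress sends the JSON output to rmlint.json, the same file that is being replayed.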

I agree that this behavior is confusing, and I am preparing a pull request that simplifies the option.