I am trying to deduplicate remote disks, in the way suggested by issues https://github.com/sahib/rmlint/issues/329 and https://github.com/sahib/rmlint/issues/199. I'm running:

    time rmlint -g -c json:unique -mkr // /home/

When I browse the resulting JSON, I see no field for a hash sum. How would rmlint on a different machine, using --replay, find duplicates?
Yes, in the interest of execution time we don't compute full checksums on files which diverge in the first few kB or so. We could add an option like -c json:hash_uniques as a work-around if you still need this.
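A quick way to see this in the report itself (a sketch, assuming jq is installed; scan.json is an illustrative file name):

    # Write the scan results to a JSON report, keeping unique files in it.
    rmlint -g -c json:unique -mkr -o json:scan.json // /home/
    # Count report entries that carry a "checksum" field; unique-sized files
    # and files that diverged early in hashing will not have one.
    jq '[ .[] | select(.checksum != null) ] | length' scan.json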
Ok @lockywolf, two new options have now been added and merged into the develop branch: https://github.com/sahib/rmlint/tree/develop.

With --hash-uniques, all found files get hashed.

With --hash-unmatched, only size-twins get hashed. This is more efficient for dupe-finding, because if you only have one file that is 4,635,235,654 bytes long then it can't have any duplicates.

Also, with either of these options specified, you no longer need -c json:unique.
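With these options, the two-machine workflow from the question could look roughly like this (a sketch against the develop branch; the report file name is illustrative):

    # Machine A: hash every found file, including uniques, and write a
    # replayable JSON report (no -c json:unique needed anymore).
    rmlint --hash-uniques -g -mkr -o json:scan.json // /home/
    # Copy scan.json to machine B, then reuse the cached results there
    # instead of re-reading file contents:
    rmlint --replay scan.json // /home/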
That's going to help, thank you!
Should this issue be closed?
Resolved by #479.