pauldreik / rdfind

find duplicate files utility

how to change behaviour of deleteduplicates option #140

Open kotsmotritnastul opened 1 year ago

kotsmotritnastul commented 1 year ago

right now it seems like this option deletes like this

file 1.txt will be considered the original; files 2.txt, 3.txt and 10.txt will be considered duplicates and get deleted

How do I get rdfind to sort files in the opposite order? a.txt should be the duplicate and b.txt should be the original. I have never written code, so I don't know what to search for and replace; I'm asking you to point out what I should replace in the source code to get this result.

fire-eggs commented 1 year ago

> right now it seems like this option deletes like this

That's not quite right - read the section of the man page entitled "Ranking". To paraphrase, the "original" is the first instance found or the one closest to the root. The -deterministic option affects this, but its description is unclear.

You don't show us your command line, but you might be able to solve your issue by changing the order of the arguments. E.g. if your command line is now: rdfind a b then a.txt (in folder a) will be the original, and b.txt (in folder b) the duplicate. By switching the order: rdfind b a then b.txt is the original and a.txt the duplicate. Experiment with the -dryrun option (-dryrun true) to see the impact before anything is actually deleted.

The logic for marking the "original" is found in Rdutil.cc, function Rdutil::markduplicates [lines 378-419]. How the logic would be modified depends on what criteria you want to use to identify the "original" file.
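To make the ranking idea concrete, here is a rough sketch of how an "original" could be picked from a group of identical files. This is not the code in Rdutil.cc: the `Candidate` struct, its fields, and the functions `outranks` and `pick_original` are all made up for illustration, and the rule order (command-line argument first, then depth, then discovery order) is only my reading of the "Ranking" section of the man page.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one file in a group of identical files.
// These names are illustrative only; they are not rdfind's own types.
struct Candidate {
  std::string path;
  int arg_index;     // position of the command-line argument the file was found under
  int depth;         // directory depth below that argument (smaller = closer to the root)
  long found_order;  // order in which the scan encountered the file
};

// Returns true if `a` should rank above `b`.
// Assumed rule order: earlier argument, then shallower depth, then found earlier.
bool outranks(const Candidate& a, const Candidate& b) {
  return std::tie(a.arg_index, a.depth, a.found_order) <
         std::tie(b.arg_index, b.depth, b.found_order);
}

// The highest-ranked file is kept as the "original";
// every other file in the group would be treated as a duplicate.
const Candidate& pick_original(const std::vector<Candidate>& group) {
  assert(!group.empty());
  return *std::min_element(group.begin(), group.end(), outranks);
}
```

With a comparison like this, swapping the argument order on the command line changes `arg_index` and therefore which file wins, which is why rdfind b a gives a different "original" than rdfind a b. Changing the criteria would mean changing the comparison itself, for example by reversing it or by comparing some other property of the files.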