sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0
1.87k stars, 131 forks

Add options to keep/avoid reflink'ed file duplicates #328

Open Fat-Zer opened 5 years ago

Fat-Zer commented 5 years ago

It would be nice to have options that keep reflinked files out of the output. They should behave much like their hardlink counterparts (--keep-hardlinked and --no-hardlinked), but rely on reflink detection (the --is-reflink logic) rather than on inode numbers.
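To illustrate why inode numbers are not enough here, below is a minimal Python sketch (not rmlint's code; `is_hardlink_pair` and `demo` are hypothetical names). It shows the kind of identity test that works for hardlinks: two paths are hardlinks of each other when they share a device and inode. A reflinked copy gets its own inode, so this test reports it as an unrelated file, and detecting it would instead require comparing the files' physical extents.

```python
import os
import shutil
import tempfile


def is_hardlink_pair(path_a, path_b):
    """Return True if both paths name the same inode on the same device."""
    stat_a = os.stat(path_a)
    stat_b = os.stat(path_b)
    return (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)


def demo():
    """Compare an original file with a hardlink and with an independent copy."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        hardlink = os.path.join(tmp, "hardlink")
        copy = os.path.join(tmp, "copy")
        with open(original, "w") as handle:
            handle.write("same content\n")
        os.link(original, hardlink)        # second name for the same inode
        shutil.copyfile(original, copy)    # new inode with equal content
        return (is_hardlink_pair(original, hardlink),
                is_hardlink_pair(original, copy))
```

The copy in `demo` stands in for any file that has its own inode, which is also the case for a reflinked clone; the inode test cannot tell the two apart.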

SeeSpotRun commented 5 years ago

Should be possible. I don't have much time at the moment but will try to get back to this in a few weeks.

SeeSpotRun commented 3 years ago

Two-and-a-bit years later...

Feel free to try out https://github.com/SeeSpotRun/rmlint/tree/feature/keep_reflinked

The main additional feature is that existing reflinks (above a configurable size) are identified before hashing. This should speed things up significantly if files are already reflinked.
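As a rough illustration of why identifying shared data before hashing can save work, here is a small Python sketch. It is not taken from the branch: `hash_once_per_group`, `identity_key`, and `read_bytes` are hypothetical names, and the identity key is assumed to be supplied by some reflink-detection step. Files that are already known to share data are read and hashed only once.

```python
import hashlib


def hash_once_per_group(paths, identity_key, read_bytes):
    """Hash one representative per identity group and reuse its digest.

    identity_key(path) must return the same hashable value for paths that
    are already known to share their data (assumed to come from a prior
    reflink-detection step). read_bytes(path) returns the file content.
    """
    digest_by_key = {}
    digest_by_path = {}
    hashed = 0
    for path in paths:
        key = identity_key(path)
        if key not in digest_by_key:
            # First member of this group: read and hash it.
            digest_by_key[key] = hashlib.sha256(read_bytes(path)).hexdigest()
            hashed += 1
        # Later members reuse the digest without being read.
        digest_by_path[path] = digest_by_key[key]
    return digest_by_path, hashed


# In-memory stand-ins for a filesystem: "a" and "b" share data, "c" does not.
contents = {"a": b"payload", "b": b"payload", "c": b"other"}
groups = {"a": 1, "b": 1, "c": 2}
digests, hashed = hash_once_per_group(["a", "b", "c"], groups.get, contents.get)
print(hashed)  # 2
```

How much time this saves in practice depends on how many files are already reflinked and how cheap the detection step is compared with reading the data.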

When used in combination with -c sh:clone, it emits a shell script that outputs skip_reflink <orig> <clone> for each existing reflink.