sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0
1.87k stars 130 forks

Clone hardlinks #467

Closed SeeSpotRun closed 3 years ago

SeeSpotRun commented 3 years ago

Fixes #466.

This PR lets hardlinks be converted to reflink clones with a single command, rmlint --dedupe <original> <hardlink>, while preserving file permissions and timestamps.

Note that converting a hardlink to a reflink doesn't free any space; in fact, it consumes a little for the necessary metadata. The advantage is that it de-links the two files' metadata, so that subsequent changes to permissions, access times, etc. are independent for the two files. Modifications such as appending data to the original also won't affect the reflink the way they would the hardlink.
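To illustrate the shared-metadata point, here is a minimal Python sketch (not part of the PR; the function and file names are made up for the demo). It creates a file plus a hardlink and shows that a chmod or an append made through one name is visible through the other, which is the coupling a reflink would remove.

```python
import os
import stat
import tempfile


def hardlink_shares_metadata(directory):
    """Create a file and a hardlink to it, then change things via one name."""
    original = os.path.join(directory, "original.txt")  # hypothetical names
    link = os.path.join(directory, "hardlink.txt")
    with open(original, "w") as f:
        f.write("data\n")
    os.link(original, link)  # second directory entry for the same inode

    # A permission change through one name applies to the shared inode.
    os.chmod(original, 0o600)
    same_inode = os.stat(original).st_ino == os.stat(link).st_ino
    link_mode = stat.S_IMODE(os.stat(link).st_mode)

    # Appending through one name is visible through the other.
    with open(original, "a") as f:
        f.write("more\n")
    with open(link) as f:
        link_content = f.read()
    return same_inode, link_mode, link_content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hardlink_shares_metadata(d))
```

With a reflink in place of the hardlink, the two names would have separate inodes, so neither the chmod nor the append would carry over.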

Note that this implementation temporarily deletes the hardlink, so an untimely crash or error could leave the hardlink missing (but not the original file). Avoiding this would require generating a unique temporary file name for the reflink.

Happy to discuss.

SeeSpotRun commented 3 years ago

I haven't gone any further with this because it can't be done atomically, so there is some risk that the hardlink gets deleted or renamed. I could do a bit more work to make it more robust, but will wait to see if there is any more interest in #466.

SeeSpotRun commented 3 years ago

OK, managed to make it reasonably atomic by cloning to a temp file in the target dir and then atomically renaming that over the top of the hardlink.
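For readers who want to see the shape of the clone-to-tempfile-then-rename approach, a rough Python sketch follows. It is an illustration only, not rmlint's actual C implementation. The function names are hypothetical. The FICLONE value is the usual Linux one from <linux/fs.h> and may differ on some architectures. Ownership and extended attributes are not handled here.

```python
import fcntl
import os
import stat
import tempfile

FICLONE = 0x40049409  # Linux ioctl number (assumed; common architectures)


def reflink_clone(src_path, dst_path):
    """Make dst_path share src_path's extents (needs btrfs, XFS or similar)."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())


def replace_link_with_clone(original, hardlink, clone=reflink_clone):
    """Swap `hardlink` for an independent clone of `original`.

    The clone is built under a temporary name in the same directory and
    then renamed over the hardlink, so the hardlink's name always refers
    to either the old inode or the finished clone.
    """
    st = os.stat(hardlink)
    target_dir = os.path.dirname(os.path.abspath(hardlink))
    # Same directory keeps the rename on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=target_dir, prefix=".clone-")
    os.close(fd)
    try:
        clone(original, tmp)
        # Carry over permissions and times before the swap.
        os.chmod(tmp, stat.S_IMODE(st.st_mode))
        os.utime(tmp, ns=(st.st_atime_ns, st.st_mtime_ns))
        os.replace(tmp, hardlink)  # atomic rename over the old name
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The `clone` parameter is only there so the rename logic can be exercised on filesystems without reflink support, for example by passing `shutil.copyfile`. If the process dies before the rename, the worst case is a stray temp file rather than a missing hardlink.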