sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

RFE: support reflinking directories #618

Open intelfx opened 1 year ago

intelfx commented 1 year ago

existing behavior

Currently, rmlint -T df -Dj -o sh:rmlint.sh -c sh:handler=reflink does nothing for duplicate directories; the generated script contains no reflink action (or any other action) for them:

# mkdir dir1 dir2
# for n in {1..10}; do dd if=/dev/urandom of=dir1/file$n bs=1M count=1024; done
# cp -a --reflink=never dir1 -T dir2
# rmlint -T df -Dj -c sh:handler=reflink dir1 dir2

# Duplicate Directorie(s):
    ls -la '/home/intelfx/tmp/test/dir1'
    rm -rf '/home/intelfx/tmp/test/dir2'
WARNING: Unexpected return code 3 from rm_util_link_type()

==> Note: Please use the saved script below for removal, not the above output.
==> In total 20 files, whereof 10 are duplicates in 10 groups.
==> This equals 10,00 GB of duplicates which could be removed.
==> 2 other suspicious item(s) found, which may vary in size.
==> Scanning took in total 21,112s.

Wrote a sh file to: /home/intelfx/tmp/test/rmlint.sh
Wrote a json file to: /home/intelfx/tmp/test/rmlint.json
rmlint -T df -Dj -c sh:handler=reflink dir1 dir2  51,30s user 15,04s system 314% cpu 21,121 total

# sed -n '/START OF AUTOGENERATED/,/END OF AUTOGENERATED/p' rmlint.sh
######### START OF AUTOGENERATED OUTPUT #########

original_cmd  '/home/intelfx/tmp/test/dir1' # original directory

######### END OF AUTOGENERATED OUTPUT #########

(Note the "Unexpected return code 3", which stands for RM_LINK_NOT_FILE.)

suggested behavior

In the shell script, reflinks are created with cp -a --reflink=always. There is nothing preventing the same command from being used to reflink-copy an entire directory. It might make sense to support that.
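
To illustrate the suggestion, here is a minimal sketch of what a directory-aware reflink handler could look like. It is not taken from rmlint's generated script: the function name reflink_dir, the REFLINK_MODE variable and the temporary-directory naming are all hypothetical, and it assumes GNU coreutils (for cp --reflink and the -T option) and a filesystem with reflink support when the mode is "always".

```shell
#!/bin/sh
# Hypothetical helper, NOT part of rmlint's generated rmlint.sh.
# Replaces the duplicate directory ($2) with a reflink copy of the original ($1).
# REFLINK_MODE is an illustrative knob; it defaults to "always", the mode the
# generated script uses for files according to the issue text.
reflink_dir() {
    orig=$1
    dupe=$2
    mode=${REFLINK_MODE:-always}
    tmp="${dupe}.reflink-tmp.$$"

    # Copy into a temporary sibling first, so the duplicate stays untouched
    # if cp fails (for example, when the filesystem cannot reflink).
    if cp -a --reflink="$mode" -T -- "$orig" "$tmp"; then
        # Only now drop the duplicate and move the reflinked copy into place.
        rm -rf -- "$dupe" && mv -T -- "$tmp" "$dupe"
    else
        # Clean up the partial copy and report failure to the caller.
        rm -rf -- "$tmp"
        return 1
    fi
}
```

With the directories from the reproduction above, this would be called as reflink_dir dir1 dir2. Because of cp -a, dir2 ends up with dir1's timestamps and ownership; a real handler might want to preserve the duplicate's own metadata instead.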