sahib / rmlint

Extremely fast tool to remove duplicates and other lint from your filesystem
http://rmlint.rtfd.org
GNU General Public License v3.0

paranoid seems faster than default #579

Closed Dexus0 closed 1 year ago

Dexus0 commented 1 year ago

I ran rmlint on two separate disks, and both showed a consistent speedup with the -p flag.

/home      7.5s => 2.4s  (197.9 GiB, 303'593 files)
/Data-Disk 1.8s => 0.37s (9.2 GiB,   2'281 files)

Data-Disk is an old hard disk drive. I will note that they all use btrfs.

It's rather counterintuitive that the -p flag seems faster than the default.

cebtenzzre commented 1 year ago

Did you clear the disk cache with sync; sudo sysctl -w vm.drop_caches=3 before each invocation? With such short execution times, rmlint is probably CPU-bound, so paranoid mode can easily be faster. On the other hand, it tends to use much more memory. See the benchmarks here.
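
The cold-cache procedure above can be wrapped in a small helper so that every run starts from the same state. This is only a sketch: `cold_time` and the `DROP_CACHES` override are hypothetical names introduced here, the default drop command must run as root, and the millisecond timing assumes GNU `date` (for `%N`).

```shell
#!/bin/sh
# Sketch: run a command from a cold page cache and print its wall-clock time.
# DROP_CACHES is a hypothetical hook for swapping out the cache-drop step;
# the default is the sysctl call quoted above and needs root.
DROP_CACHES=${DROP_CACHES:-"sysctl -w vm.drop_caches=3"}

cold_time() {
    # Write dirty pages to disk first, then drop page cache, dentries and inodes.
    sync
    $DROP_CACHES >/dev/null || return 1

    # Nanosecond timestamps; %N is a GNU date extension (assumption).
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    rc=$?
    end=$(date +%s%N)

    # Report elapsed milliseconds, the command's exit code, and the command.
    echo "$(( (end - start) / 1000000 )) ms (exit $rc): $*"
    return "$rc"
}
```

In a root shell, `cold_time rmlint /home` followed by `cold_time rmlint -p /home` would then compare the two modes without the second run benefiting from data cached by the first.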

Dexus0 commented 1 year ago

Running rm -I rmlint.* ; sync; sysctl -w vm.drop_caches=3; rmlint [-p] (in a root shell) made the times noticeably closer.

/home      10.9s => 9.1s
/Data-Disk 3.2s => 3.2s

In Data-Disk's case the results were rather volatile. I noted them down as the same, but the default did at times finish faster than -p ever did; it just didn't do so consistently.
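
To put numbers on how much closer the times are, the speedup of -p over the default can be computed from the timings reported in this thread. This is a rough sketch; `speedup` is a hypothetical helper name, and the inputs are single measurements, so the ratios are indicative only.

```shell
#!/bin/sh
# Sketch: speedup factor = default time / paranoid time (both in seconds).
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1fx\n", a / b }'
}

# Warm cache (first report)
speedup 7.5 2.4     # /home      -> 3.1x
speedup 1.8 0.37    # /Data-Disk -> 4.9x

# Cold cache (after sync and drop_caches)
speedup 10.9 9.1    # /home      -> 1.2x
speedup 3.2 3.2     # /Data-Disk -> 1.0x
```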

cebtenzzre commented 1 year ago

I don't think there's a problem here. The advantages and disadvantages of paranoid mode are covered well by the documentation, and the difference you've noted is very small with disk cache out of the picture.