The setup: create 100K files of 1024 bytes each, i.e. 100MB of input:
echo "Creating directory structure, will take a minute"
mkdir dd
for d in `seq 1 100`; do
	mkdir dd/$d
	for f in `seq 1 1000`; do
		printf "%*s" 1024 "$f" > dd/$d/$f
	done
done
sync
Before the change this input took 40 seconds to process:
$ time ./duperemove -q -rd dd/
...
real 0m39,835s
user 1m54,903s
sys 0m8,922s
After the change the same run is more than 2x faster:
$ time ./duperemove -q -rd dd/
...
real 0m14,616s
user 0m11,942s
sys 0m2,580s
The main overhead was a single 8MB calloc() performed for every file, regardless of its size. The change should reduce this per-file setup overhead when running against small files.
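For illustration only, a minimal C sketch of the kind of change described: instead of unconditionally zero-allocating the 8MB maximum per file, size the allocation to the file and cap it at the old maximum. The function names, the constant, and the signatures below are hypothetical and are not duperemove's actual code.

#include <stdlib.h>
#include <stdint.h>

#define MAX_READ_BUF	(8ULL * 1024 * 1024)	/* old fixed 8MB allocation */

/* Before: every file, even a 1KB one, paid for an 8MB zeroed buffer. */
static void *alloc_read_buf_old(uint64_t filesize)
{
	(void)filesize;
	return calloc(1, MAX_READ_BUF);
}

/* After: size the buffer to the file, capped at the old maximum. */
static void *alloc_read_buf_new(uint64_t filesize)
{
	uint64_t len = filesize < MAX_READ_BUF ? filesize : MAX_READ_BUF;

	if (len == 0)
		len = 1;	/* keep a valid, non-NULL allocation for empty files */
	return calloc(1, len);
}

int main(void)
{
	/* For a 1024-byte file the new path allocates 1KB instead of 8MB. */
	void *old_buf = alloc_read_buf_old(1024);
	void *new_buf = alloc_read_buf_new(1024);

	free(old_buf);
	free(new_buf);
	return 0;
}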