pauldreik / rdfind

find duplicate files utility

rdfind misses existing hard links (need to run it twice) #109

Open patmaddox opened 2 years ago

patmaddox commented 2 years ago

I have a file that has an existing hard link, and then I create a copy of it. rdfind breaks the existing link and links the original to the copy, but the previously linked file is now left as a separate duplicate. So I have to run rdfind twice for it to catch all the hard links.

Here's a script that demonstrates the behavior. "dir size after rdfind #1" should be 1.0 MB, but it's 2.0 MB instead. The directory isn't fully de-duplicated until I run rdfind a second time, when it spots that all three files should link to the same inode.

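#!/bin/sh

# Set up: dir1/file and dir2/file are hard links to one inode,
# and _test/file is a separate copy with the same content.
mkdir _test
mkdir _test/dir1 _test/dir2
# 1 MiB of random data. bs is given in bytes because the "1m" suffix
# is not accepted by every dd (GNU dd expects "1M").
dd if=/dev/urandom of=_test/dir1/file bs=1048576 count=1
ln _test/dir1/file _test/dir2/file
cp _test/dir1/file _test/file

echo "initial dir size"
du -hs _test

# First pass: expected to leave all three files on one inode (1.0 MB).
rdfind -makehardlinks true -makeresultsfile false _test
echo "dir size after rdfind #1"
du -hs _test

# Second pass: only now does the size drop to 1.0 MB.
rdfind -makehardlinks true -makeresultsfile false _test
echo "dir size after rdfind #2"
du -hs _test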
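
Since du only shows the effect indirectly, the inode sharing can also be checked directly. The following is a minimal sketch, not part of rdfind: `count_inodes` is a hypothetical helper name, and it assumes all files live on one filesystem (inode numbers are only unique per device) and that no file name contains a newline.

```shell
#!/bin/sh

# count_inodes DIR: print how many distinct inodes back the regular
# files under DIR. Hypothetical helper for this report, not part of rdfind.
# Assumes a single filesystem and no newlines in file names.
count_inodes() {
    # "ls -i" prints the inode number as the first field of each line;
    # awk counts each inode number the first time it appears.
    find "$1" -type f -exec ls -i {} + |
        awk '!seen[$1]++ { n++ } END { print n + 0 }'
}
```

Calling `count_inodes _test` right after the setup steps prints 2 (the linked pair plus the copy), and it should print 1 once all three files are hard-linked together.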