adrianlopezroche / fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

Combine redundant code bits that only call stat() #71

Open jbruchon opened 7 years ago

jbruchon commented 7 years ago

Several functions each call stat() and return only a single field from the resulting struct stat:

filesize()
getdevice()
getinode()
getmtime()
getctime()

There is also a stat() call at line 327 and a function getfilestats() that calls some of the stat()-based functions listed above.

The overhead from redundant function calls and stat() system calls is heavy. For the 19 files and directories in testdir, running fdupes -nrq testdir/ results in a total of 163 redundant stat() calls, according to an strace log. On a different file tree with 1056 files and directories, the excess stat() count shoots up to 38559.

I propose combining all of these functions so that each file is stat()ed only one time, with the relevant struct stat items stored all at once.
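
A minimal sketch of what this could look like, assuming a new struct and helper. The names file_stats and load_file_stats() are hypothetical and do not exist in fdupes; the fields simply mirror the five accessors listed above:

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical container (not part of fdupes): one field per accessor
 * named in this issue, all filled from a single stat() result. */
struct file_stats {
    off_t  size;         /* value filesize() would return  */
    dev_t  device;       /* value getdevice() would return */
    ino_t  inode;        /* value getinode() would return  */
    time_t mod_time;     /* value getmtime() would return  */
    time_t change_time;  /* value getctime() would return  */
};

/* Calls stat() exactly once and copies the relevant fields into *out.
 * Returns 0 on success, -1 if stat() fails (errno is left as set by stat()). */
static int load_file_stats(const char *path, struct file_stats *out)
{
    struct stat s;

    if (stat(path, &s) != 0)
        return -1;

    out->size        = s.st_size;
    out->device      = s.st_dev;
    out->inode       = s.st_ino;
    out->mod_time    = s.st_mtime;
    out->change_time = s.st_ctime;
    return 0;
}
```

With something like this, the result could be stored alongside each file entry when the file is first seen, and later comparisons would read the cached fields instead of calling stat() again.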

golimarrrr commented 6 years ago

That is probably why it's quite slow when used on files in network shares (compared, for example, against md5sum).