jvirkki / dupd

CLI utility to find duplicate files
http://www.virkki.com/dupd
GNU General Public License v3.0

Feature request: using Fast Positive Hash as a hash function #16

Open data-man opened 6 years ago

data-man commented 6 years ago

t1ha is one of the fastest hash functions I know. The author of the library has a fork of jdupes, and it is very fast. It would be great if dupd could optionally use this library.

jvirkki commented 6 years ago

Thanks for the pointer, I'll give it a try.

I don't want to make it a hard dependency because I'm trying to keep dupd free of all external dependencies so it is easy for everyone to build. What I could do is make it dlopen() the shared library if available; that way the support stays optional.
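For illustration, the dlopen() idea could look roughly like the sketch below. This is not dupd's code: `load_optional_hash`, `builtin_hash`, and the `hash_fn` typedef are hypothetical names, the FNV-1a fallback is only a stand-in for whatever built-in hash the program already has, and the library name `libt1ha.so` and symbol `t1ha2_atonce` (assumed to take data, length, and a 64-bit seed) should be checked against the installed t1ha headers.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed common signature: (data, length, seed) -> 64-bit hash. */
typedef uint64_t (*hash_fn)(const void *data, size_t len, uint64_t seed);

/* Stand-in built-in hash (64-bit FNV-1a, seed XORed into the offset basis).
 * It only plays the role of "the hash we always have". */
static uint64_t builtin_hash(const void *data, size_t len, uint64_t seed)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;   /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;                     /* FNV prime */
    }
    return h;
}

/* Try to load `symbol` from the shared library `libname`.
 * Returns the built-in hash if the library or symbol is missing,
 * so the external library never becomes a hard dependency. */
static hash_fn load_optional_hash(const char *libname, const char *symbol)
{
    void *handle = dlopen(libname, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        return builtin_hash;
    }

    hash_fn fn = NULL;
    /* POSIX idiom for converting the dlsym() result to a function pointer. */
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return builtin_hash;
    }

    /* The handle is intentionally kept open for the life of the process. */
    return fn;
}
```

A caller would do something like `hash_fn h = load_optional_hash("libt1ha.so", "t1ha2_atonce");` once at startup and then call `h(buf, len, 0)` wherever it hashes a block. On older glibc versions the program needs to be linked with `-ldl`.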

Note that dupd is not usually (although it can happen) hash-speed limited, so in most cases it probably won't see much difference. But always interesting to try.

jbruchon commented 5 years ago

FWIW I switched from my jodyhash to xxHash64 because it's faster and is a simple .c/.h file pair with an easy C interface.

jvirkki commented 5 years ago

Same, dupd switched to xxHash back in 1.6. Works quite well.