benalexau / hash-bench

Java Hashing, CRC and Checksum Benchmark (JMH)
MIT License

Extend with checksum algorithms #1

Closed leventov closed 8 years ago

leventov commented 8 years ago

The fact that the fastest hash function (xxh64) slightly outperforms even crc32 raises the question: what is the fastest checksum algorithm available in Java, excluding unreliable "algorithms" such as plain XOR-ing of all words of the data?
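To illustrate why plain XOR-ing is called unreliable here, the following is a minimal sketch, not code from Hash-Bench. The class name `XorVsCrc32` and both helper methods are hypothetical, and the XOR helper assumes the input length is a multiple of 4 bytes. XOR is insensitive to word order, so reordering the words of a message leaves the result unchanged, while the JRE's `java.util.zip.CRC32` tells the two inputs apart.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class XorVsCrc32 {

    // XOR of all 32-bit big-endian words. Hypothetical helper for illustration;
    // assumes data.length is a multiple of 4 (trailing bytes are ignored).
    public static int xorWords(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int acc = 0;
        while (buf.remaining() >= 4) {
            acc ^= buf.getInt();
        }
        return acc;
    }

    // CRC-32 as provided by the JRE.
    public static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] original = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] swapped  = {5, 6, 7, 8, 1, 2, 3, 4}; // same two words, reordered

        // XOR cannot see the reordering; CRC-32 can.
        System.out.println("xor equal:   " + (xorWords(original) == xorWords(swapped)));
        System.out.println("crc32 equal: " + (crc32(original) == crc32(swapped)));
    }
}
```

This only shows one failure mode (reordering); it says nothing about relative speed, which is what the benchmark measures.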

I haven't explored this field deeply, but this page probably lists some options that might be faster than xxh64: https://en.wikipedia.org/wiki/List_of_hash_functions

benalexau commented 8 years ago

http://www.jonelo.de/java/jacksum/ might be useful for this.

benalexau commented 8 years ago

I've added Jacksum and the benchmarks are running. With all the extra algorithms it's going to take roughly 27 hours to complete. I'll publish a new report when it's finished.

benalexau commented 8 years ago

I've added report #3 which includes most of the checksums and CRCs listed on the Wikipedia page.

There are now 92 separate algorithms tested by Hash-Bench:

adler32-jacksum-alt
adler32-jacksum-delegating
adler32-jre
city64-zah
cksum-jacksum
crc16-jacksum
crc24-jacksum
crc32-guava-delegating
crc32-guava-rfc3720
crc32-jacksum-bzip2
crc32-jacksum-delegating
crc32-jacksum-fcs32
crc32-jacksum-mpeg2
crc32-jre
crc64-jacksum
crc8-jacksum
ed2k-jacksum
elf-jacksum
farmna-zah
farmuo-zah
fcs16-jacksum
gfh32-guava
gfh64-guava
has160-jacksum
haval128h3-jacksum
haval128h4-jacksum
haval128h5-jacksum
haval160h3-jacksum
haval160h4-jacksum
haval160h5-jacksum
haval192h3-jacksum
haval192h4-jacksum
haval192h5-jacksum
haval224h3-jacksum
haval224h4-jacksum
haval224h5-jacksum
haval256h3-jacksum
haval256h4-jacksum
haval256h5-jacksum
md2-jacksum
md4-jacksum
md5-guava
md5-jacksum
md5-jacksum-alt
murmur3h128-guava
murmur3h128-zah
murmur3h32-guava
ripemd128-jacksum
ripemd160-jacksum
ripemd256-jacksum
ripemd320-jacksum
sha0-jacksum
sha1-guava
sha1-jacksum
sha1-jacksum-alt
sha224-jacksum
sha256-guava
sha256-jacksum
sha256-jacksum-alt
sha384-guava
sha384-jacksum
sha384-jacksum-alt
sha512-guava
sha512-jacksum
sha512-jacksum-alt
sip-fwdeng
sip-guava
sip-inline
sum16-jacksum
sum24-jacksum
sum32-jacksum
sum8-jacksum
sumbsd-jacksum
sumsysv-jacksum
tiger128-jacksum
tiger160-jacksum
tiger2-jacksum
tiger2-jacksum-tree
tiger-jacksum
tiger-jacksum-tree
whirlpool0-jacksum
whirlpool1-jacksum
whirlpool2-jacksum
xor8-jacksum
xxh32-jpountz-jni
xxh32-jpountz-safe
xxh32-jpountz-unsafe
xxh64-jpountz-jni
xxh64-jpountz-safe
xxh64-jpountz-unsafe
xxh64-zah

Given we've got nearly 100 there, I think it's reasonably indicative of the different hash, CRC and checksum latencies on the JVM. If you'd like any extras please open a ticket with a link to the desired Java implementation(s) and I will be happy to add them.
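For readers unfamiliar with the JRE entries in the list, here is a minimal sketch of how `CRC32` and `Adler32` can be driven through the common `java.util.zip.Checksum` interface. It assumes that `crc32-jre` and `adler32-jre` refer to these `java.util.zip` classes; the class name `JreChecksums` and the `checksum` helper are hypothetical and are not taken from the Hash-Bench source.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class JreChecksums {

    // Runs any java.util.zip.Checksum implementation over the whole array.
    // Hypothetical helper; a real benchmark harness would wrap a call like
    // this in a JMH @Benchmark method rather than timing it by hand.
    public static long checksum(Checksum algo, byte[] data) {
        algo.reset();
        algo.update(data, 0, data.length);
        return algo.getValue();
    }

    public static void main(String[] args) {
        // "123456789" is the conventional check input for CRC catalogues.
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        System.out.printf("crc32   = %08x%n", checksum(new CRC32(), data));   // cbf43926
        System.out.printf("adler32 = %08x%n", checksum(new Adler32(), data)); // 091e01de
    }
}
```

Checking a known input like this before benchmarking can help confirm that two implementations labelled with the same algorithm name really compute the same function.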

leventov commented 8 years ago

This is an incredible undertaking.

Note that there are some table formatting issues on the results page: some cells are empty, and numbers appear in the columns meant for algorithm names.

FYI, I've just asked Johann N. Löfflmann (the Jacksum author) to host his library on GitHub and/or publish it on Maven Central.

benalexau commented 8 years ago

I also emailed Johann (as well as Adrien Grand, @jpountz, from lz4 for Java) so they know about the benchmark and can suggest any improvements.

Re the table formatting, there was a bug which I corrected locally and re-pushed. Can you please refresh the results page and see if the issue remains? If so, please let me know which table you're looking at.

leventov commented 8 years ago

Yes, now it looks good. Thanks!

benalexau commented 8 years ago

I've just added support for Bouncy Castle, taking Hash-Bench to around 113 tested algorithms. I'll generate a new report overnight and publish it tomorrow.