tevador / RandomX

Proof of work algorithm based on random code execution
BSD 3-Clause "New" or "Revised" License

Unoptimized BLAKE2b #60

Open veorq opened 5 years ago

veorq commented 5 years ago

Issue reported in the context of Kudelski Security's audit

The implementation does not leverage vectorized instructions. For example, on platforms supporting AVX2, a portable reference implementation is about 40% slower than an AVX2 implementation, as reported in a SUPERCOP benchmark on a Cannonlake microarchitecture.

An AVX2 implementation of BLAKE2b can be found in the SUPERCOP archive as well as in libsodium. An AVX512-optimized version of BLAKE2s (not BLAKE2b) is used in WireGuard. Similar techniques may be used to optimize BLAKE2b for the AVX512 instruction set.
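To illustrate why BLAKE2b lends itself to this kind of vectorization, here is a minimal portable C sketch (not taken from RandomX, SUPERCOP, or libsodium). It compares one round written as eight scalar G calls with the same round written over four-word rows, where plain loops over four lanes stand in for 256-bit AVX2 operations. The names `g_lanes`, `rot_lanes`, `round_lanes`, and `rounds_agree` are hypothetical, and `m` is assumed to hold the message words already permuted for the current round.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 64-bit word right by n bits (0 < n < 64). */
static uint64_t rotr64(uint64_t x, unsigned n) {
    return (x >> n) | (x << (64 - n));
}

/* Scalar BLAKE2b G function as specified in RFC 7693. */
static void g_scalar(uint64_t v[16], int a, int b, int c, int d,
                     uint64_t x, uint64_t y) {
    v[a] = v[a] + v[b] + x;  v[d] = rotr64(v[d] ^ v[a], 32);
    v[c] = v[c] + v[d];      v[b] = rotr64(v[b] ^ v[c], 24);
    v[a] = v[a] + v[b] + y;  v[d] = rotr64(v[d] ^ v[a], 16);
    v[c] = v[c] + v[d];      v[b] = rotr64(v[b] ^ v[c], 63);
}

/* One round, reference style: four column G calls, then four diagonal ones.
   Assumption: m[] is already permuted for this round. */
static void round_scalar(uint64_t v[16], const uint64_t m[16]) {
    g_scalar(v, 0, 4,  8, 12, m[0],  m[1]);
    g_scalar(v, 1, 5,  9, 13, m[2],  m[3]);
    g_scalar(v, 2, 6, 10, 14, m[4],  m[5]);
    g_scalar(v, 3, 7, 11, 15, m[6],  m[7]);
    g_scalar(v, 0, 5, 10, 15, m[8],  m[9]);
    g_scalar(v, 1, 6, 11, 12, m[10], m[11]);
    g_scalar(v, 2, 7,  8, 13, m[12], m[13]);
    g_scalar(v, 3, 4,  9, 14, m[14], m[15]);
}

/* G applied to four independent lanes. With AVX2, each statement in the
   loop body would become one operation on a 256-bit register. */
static void g_lanes(uint64_t a[4], uint64_t b[4], uint64_t c[4], uint64_t d[4],
                    const uint64_t x[4], const uint64_t y[4]) {
    for (unsigned i = 0; i < 4; i++) {
        a[i] = a[i] + b[i] + x[i];  d[i] = rotr64(d[i] ^ a[i], 32);
        c[i] = c[i] + d[i];         b[i] = rotr64(b[i] ^ c[i], 24);
        a[i] = a[i] + b[i] + y[i];  d[i] = rotr64(d[i] ^ a[i], 16);
        c[i] = c[i] + d[i];         b[i] = rotr64(b[i] ^ c[i], 63);
    }
}

/* Rotate a four-word row left by n lanes (stands in for a lane permute). */
static void rot_lanes(uint64_t row[4], unsigned n) {
    uint64_t tmp[4];
    assert(n < 4);
    for (unsigned i = 0; i < 4; i++) tmp[i] = row[(i + n) & 3];
    memcpy(row, tmp, sizeof tmp);
}

/* The same round over rows: column step, diagonalize, diagonal step, undo. */
static void round_lanes(uint64_t v[16], const uint64_t m[16]) {
    uint64_t *a = v, *b = v + 4, *c = v + 8, *d = v + 12;
    uint64_t x[4], y[4];
    for (unsigned i = 0; i < 4; i++) { x[i] = m[2 * i]; y[i] = m[2 * i + 1]; }
    g_lanes(a, b, c, d, x, y);
    rot_lanes(b, 1); rot_lanes(c, 2); rot_lanes(d, 3);
    for (unsigned i = 0; i < 4; i++) { x[i] = m[8 + 2 * i]; y[i] = m[9 + 2 * i]; }
    g_lanes(a, b, c, d, x, y);
    rot_lanes(b, 3); rot_lanes(c, 2); rot_lanes(d, 1);
}

/* Returns 1 if both formulations agree on an arbitrary fixed input. */
static int rounds_agree(void) {
    uint64_t v1[16], v2[16], m[16];
    for (unsigned i = 0; i < 16; i++) {
        v1[i] = 0x0123456789abcdefULL * (i + 1);
        m[i]  = 0xfedcba9876543210ULL ^ (i * 0x9e3779b97f4a7c15ULL);
    }
    memcpy(v2, v1, sizeof v1);
    round_scalar(v1, m);
    round_lanes(v2, m);
    return memcmp(v1, v2, sizeof v1) == 0;
}
```

`rounds_agree()` returns 1 when the two formulations produce the same state. An actual AVX2 port would replace each four-lane loop with a single intrinsic, which is where any speedup would come from.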

tevador commented 5 years ago

Yes, we are using the reference implementations for both Blake2 and Argon2 since neither is performance-critical. Supporting optimized implementations may be desirable.

We are currently exploring the use of libsodium.