dd86k / sha3-d

Pure D implementation of SHA-3 (Keccak-f[1600,24]) + DUB package
https://code.dlang.org/packages/sha3-d
Boost Software License 1.0

Add support for core.simd, intel-intrinsics, or inlined assembly #1

Open dd86k opened 2 years ago

dd86k commented 2 years ago

This module currently lacks any form of acceleration, and adding some would benefit everyone.

Options:

  1. core.simd -- Supported everywhere, I think.
  2. intel-intrinsics DUB package -- Somewhat supports all compilers.
  3. Inlined assembly -- If all else fails, at least x86 users would benefit. But limited to AVX/AVX2 and not SSE* at best (because DMD).

The plan is to try options 1 and 2 and see which yields the best results (judged by benchmarks and Godbolt output).

NOTE: This wasn't implemented at first because this module was once a candidate for inclusion in Phobos.

dd86k commented 2 years ago

Or I could do what the std.digest.sha package does and at least support SSSE3 (it sets version USE_SSSE3 when D_InlineAsm_X86 or D_InlineAsm_X86_64 is defined). I already have experience using the inline assembler under DMD, GDC, and LDC.
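A minimal sketch of that pattern, for illustration only: one internal version identifier is derived from the compiler-defined D_InlineAsm_X86 and D_InlineAsm_X86_64, and the code path is chosen at compile time. The module name, the identifier Sha3dAsmSketch, and the function backendName are hypothetical and not part of sha3-d or Phobos.

```d
module backend_sketch; // hypothetical module name

// Derive a single internal identifier from the compiler-defined ones.
// Version assignments must appear at module scope, before first use.
version (D_InlineAsm_X86)
{
    version = Sha3dAsmSketch; // hypothetical identifier
}
else version (D_InlineAsm_X86_64)
{
    version = Sha3dAsmSketch;
}

/// Reports which code path was compiled in.
string backendName() pure nothrow @nogc @safe
{
    version (Sha3dAsmSketch)
        return "inline-asm"; // an asm-accelerated path would be selected here
    else
        return "portable";   // plain D fallback
}
```

Which string comes back depends on the compiler and target, since not every compiler defines the D_InlineAsm_* identifiers.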

dd86k commented 2 years ago

I don't really think this needs SIMD that much, because when compiled with LDC or GDC I get performance similar to OpenSSL.

Test env:

Results (input: pv -r /dev/urandom piped):

dd86k commented 2 years ago

A new test under Windows (pv from Cygwin, which supports /dev/urandom) measures sha3-d at 140 MiB/s and OpenSSL-Win64 3.0.1 at 232 MiB/s, so yeah, I do see the difference now.

dd86k commented 2 years ago

In any case, a version identifier such as Sha3dUseSIMD or Sha3dUseIntrinsics should be provided. Selecting it should be manual. That feels better to me, at least.
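A rough sketch of what manual selection could look like, assuming the Sha3dUseSIMD name proposed here. The functions permute, permutePortable, and permuteSimd are hypothetical placeholders, not the module's actual API, and their bodies are intentionally empty.

```d
// Opt in at build time with one of:
//   dmd  -version=Sha3dUseSIMD app.d
//   ldc2 --d-version=Sha3dUseSIMD app.d
//   gdc  -fversion=Sha3dUseSIMD app.d

/// True only when the user asked for acceleration at build time.
version (Sha3dUseSIMD)
    enum bool simdRequested = true;
else
    enum bool simdRequested = false;

/// Placeholder for the existing portable Keccak-f[1600] permutation.
private void permutePortable(ref ulong[25] state) pure nothrow @nogc @safe
{
    // portable rounds would go here
}

version (Sha3dUseSIMD)
{
    /// Placeholder for an accelerated permutation (hypothetical).
    private void permuteSimd(ref ulong[25] state) pure nothrow @nogc @safe
    {
        // accelerated rounds would go here
    }
}

/// Hypothetical dispatch point, resolved entirely at compile time.
void permute(ref ulong[25] state) pure nothrow @nogc @safe
{
    static if (simdRequested)
        permuteSimd(state);
    else
        permutePortable(state);
}
```

Because the choice is made with version and static if, the unselected path adds no runtime cost, and a default build behaves exactly as it does today.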

dd86k commented 8 months ago

Plan:

Notes: