Open dd86k opened 2 years ago
Or I could do what the `std.digest.sha` package does and at least support SSSE3 (there, `version USE_SSSE3` is derived from `D_InlineAsm_X86` and `D_InlineAsm_X86_64`). At least I already have experience using the inline assembler under DMD, GDC, and LDC.
I don't really think this needs SIMD this much because when compiled with LDC or GDC, I get similar performance results compared to OpenSSL.
Test env:
- `release-nobounds`.

Results (input: `pv -r /dev/urandom` piped):
- `openssl dgst -sha3-256`: 111-127 MiB/s
- `ddh sha3-256` with dmd 2.098.1: 32-34 MiB/s (worse than 2.090.1, which is more around the 42 MiB/s mark)
- `ddh sha3-256` with gdc 10.3: 90-91 MiB/s (worse since I upgraded dmd?!)
- `ddh sha3-256` with ldc 1.20.1: 119-124 MiB/s

New test under Windows (pv from Cygwin, supporting `/dev/urandom`) evaluates sha3-d at 140 MiB/s and OpenSSL-Win64 3.0.1 at 232 MiB/s, so yeah, I do see the difference now.
In any case, a version `Sha3dUseSIMD` or `Sha3dUseIntrinsics` should be provided. Selecting it should be manual; it feels better this way, to me at least.
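A minimal sketch of what a manual opt-in could look like, assuming the `Sha3dUseSIMD` identifier mentioned above. The `sha3dAccelerated` enum is a hypothetical name used only for illustration and is not part of the module:

```d
// Hypothetical opt-in switch: nothing is accelerated unless the user
// explicitly passes the version identifier at build time.
version (Sha3dUseSIMD)
    enum bool sha3dAccelerated = true;
else
    enum bool sha3dAccelerated = false;

void main()
{
    import std.stdio : writeln;

    // Report which code path this build selected.
    writeln("accelerated path: ", sha3dAccelerated);
}
```

The identifier would be set with `dmd -version=Sha3dUseSIMD`, `ldc2 --d-version=Sha3dUseSIMD`, or `gdc -fversion=Sha3dUseSIMD`, or through a `"versions"` entry in the dub recipe.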
Plan:
- `version (D_AVX)`: Use AVX impl.
- `version (D_AVX2)`: Use AVX2 impl.

Notes:
- `D_SIMD`.
- `compiles` trait
- `core.simd` will adapt to using SSE or AVX somehow
- `AVX` and `AVX2` by default.
- `-mcpu=avx2`
- `-mattr=+avx2`
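The plan above could be sketched roughly as follows. This is only an illustration, assuming the predefined `D_AVX2`, `D_AVX`, and `D_SIMD` version identifiers are what drives the choice. The `sha3dImpl` name is hypothetical, and the strings stand in for real implementations:

```d
// Pick the widest implementation the compiler was told to target.
// D_AVX2 is tested first because an AVX2 target also defines D_AVX.
version (D_AVX2)
    enum string sha3dImpl = "avx2";
else version (D_AVX)
    enum string sha3dImpl = "avx";
else version (D_SIMD)
    enum string sha3dImpl = "simd";
else
    enum string sha3dImpl = "scalar";

void main()
{
    import std.stdio : writeln;

    // Show which branch was selected for this build.
    writeln("sha3-d implementation: ", sha3dImpl);
}
```

With DMD, `-mcpu=avx2` should select the first branch, and with LDC `-mattr=+avx2` should do the same. A default build is expected to fall through to one of the later branches.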
This module currently lacks any form of acceleration, and having some would benefit everyone.
Options:
The plan is to try options 1 and 2 and see which yields the best results (through benchmark and godbolt).
NOTE: The reason this wasn't implemented at first is that this module was once a candidate for inclusion in Phobos.