iotaledger / entangled

enTangle'd is an amalgamation of all things Tangle
Apache License 2.0

Use SIMD for ptrit_t. #1377

Open semenov-vladyslav opened 5 years ago

semenov-vladyslav commented 5 years ago

Is your feature request related to a problem? Please describe. cIRI transaction validation runs on the CPU. The incoming transaction queue grows and the processor cannot keep up, so validation needs to be sped up.

Describe the solution you'd like Validation essentially consists of computing curl hash values. The ptrit_t type allows 64 instances of curl to run simultaneously because it uses uint64_t internally (the curl-p module). We could switch to the SIMD extensions of the Intel x86 architecture (SSE/SSE2, AVX/AVX2, AVX512F) and use __m128i, __m256i or __m512i instead of uint64_t. That would give approximately a 2x, 4x or 8x speedup on a single core. ARM processors have the 128-bit NEON SIMD extension.
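To make the idea concrete, here is a minimal sketch of why widening the word type widens the number of parallel instances. It assumes a ptrit is stored as a pair of bit-planes (low, high) and uses a stand-in bitwise step of the kind a bitsliced S-box uses. All names (ptrit64_t, step64, step128, step128_matches_step64) are hypothetical and are not the entangled API, and the formula is illustrative rather than copied from curl-p. The SSE2 path is used only when the compiler defines __SSE2__; otherwise the same work is done with two 64-bit words.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 integer intrinsics on __m128i */
#endif

/* Hypothetical 64-lane ptrit: bit i of low/high belongs to instance i. */
typedef struct { uint64_t low, high; } ptrit64_t;

/* Stand-in bitwise step (an assumption, not the entangled code).
 * Bit i of the result depends only on bit i of the inputs, so one
 * call advances 64 independent instances. */
static ptrit64_t step64(ptrit64_t a, ptrit64_t b) {
  uint64_t delta = (a.low | ~b.high) & (b.low ^ a.high);
  ptrit64_t r;
  r.low = ~delta;
  r.high = (a.low ^ b.high) | delta;
  return r;
}

/* Same step over 128 lanes; each bit-plane is passed as two 64-bit words. */
static void step128(const uint64_t al[2], const uint64_t ah[2],
                    const uint64_t bl[2], const uint64_t bh[2],
                    uint64_t out_low[2], uint64_t out_high[2]) {
#if defined(__SSE2__)
  const __m128i ones = _mm_set1_epi32(-1); /* all bits set, used for NOT */
  __m128i a_low = _mm_loadu_si128((const __m128i *)al);
  __m128i a_high = _mm_loadu_si128((const __m128i *)ah);
  __m128i b_low = _mm_loadu_si128((const __m128i *)bl);
  __m128i b_high = _mm_loadu_si128((const __m128i *)bh);
  /* delta = (a.low | ~b.high) & (b.low ^ a.high) */
  __m128i t = _mm_or_si128(a_low, _mm_xor_si128(b_high, ones));
  __m128i delta = _mm_and_si128(t, _mm_xor_si128(b_low, a_high));
  _mm_storeu_si128((__m128i *)out_low, _mm_xor_si128(delta, ones));
  _mm_storeu_si128((__m128i *)out_high,
                   _mm_or_si128(_mm_xor_si128(a_low, b_high), delta));
#else
  /* Portable fallback: two independent 64-lane steps. */
  int i;
  for (i = 0; i < 2; i++) {
    ptrit64_t a, b, r;
    a.low = al[i]; a.high = ah[i];
    b.low = bl[i]; b.high = bh[i];
    r = step64(a, b);
    out_low[i] = r.low;
    out_high[i] = r.high;
  }
#endif
}

/* Returns 1 if the 128-lane step equals two separate 64-lane steps. */
static int step128_matches_step64(const uint64_t al[2], const uint64_t ah[2],
                                  const uint64_t bl[2], const uint64_t bh[2]) {
  uint64_t rl[2], rh[2];
  int i;
  step128(al, ah, bl, bh, rl, rh);
  for (i = 0; i < 2; i++) {
    ptrit64_t a, b, s;
    a.low = al[i]; a.high = ah[i];
    b.low = bl[i]; b.high = bh[i];
    s = step64(a, b);
    if (s.low != rl[i] || s.high != rh[i]) return 0;
  }
  return 1;
}
```

Because the step is purely bitwise, the 128-bit version does the work of two 64-bit calls with the same number of logic operations. That is where the approximate 2x figure comes from. The same pattern would extend to __m256i and __m512i with the corresponding AVX2 and AVX-512F intrinsics.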

Ptrits are used only in curl-p (pearl diver and curl itself), so no other code will break after these changes.

Describe alternatives you've considered Several threads of curl-p can be run on a multicore processor or on a graphics card. These alternatives, however, seem to be somewhat more complex.


semenov-vladyslav commented 5 years ago

There is a project dcurl that does exactly what is needed. Can/should we use it?

jserv commented 5 years ago

Cc. @marktwtn