intel / isa-l_crypto

AVX512: Use embedded broadcast to replicate constants from memory. #106

Closed. Shark64 closed this 1 year ago

Shark64 commented 1 year ago

With AVX512 we can use the embedded broadcast option to replicate constants from memory. This reduces the .data size quite a lot. All the constants are at least 32 bits in size, so they're expanded "for free" during the load without lengthening the critical path. I've switched all the AVX512 routines to scalar broadcast, plus a couple of minor optimizations: keeping shuffle masks in registers instead of reloading them each time, using VPTERNLOG for 3-way bitwise logic, and generally trying to use instructions with shorter encodings. Tested only on Rocket Lake under Linux; it passes "make test" and test_checks.sh on my machine.
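To make the two main ideas above concrete, here is a minimal portable C sketch. It is a scalar model for illustration, not the assembly from this pull request. It assumes a `{1to16}`-style broadcast of a 32-bit constant into a 512-bit register and the standard VPTERNLOG truth-table immediate. All function names (`broadcast_dword_model`, `bytes_saved_per_constant`, `ternlog_imm8`, `xor3`, `majority`, `choose`) are hypothetical helpers invented for this example.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stddef.h>
#include <stdint.h>

#define ZMM_DWORDS 16  /* a 512-bit register holds 16 x 32-bit lanes */

/* Scalar model of an embedded broadcast: one dword read from memory
   is replicated into every lane of the (modelled) 512-bit register. */
void broadcast_dword_model(const uint32_t *mem, uint32_t lanes[ZMM_DWORDS])
{
    for (size_t i = 0; i < ZMM_DWORDS; i++)
        lanes[i] = *mem;
}

/* Constant storage saved per replicated 32-bit constant:
   a full 64-byte vector in .data versus a single 4-byte scalar. */
size_t bytes_saved_per_constant(void)
{
    return ZMM_DWORDS * sizeof(uint32_t) - sizeof(uint32_t);
}

typedef unsigned (*bool3_fn)(unsigned a, unsigned b, unsigned c);

/* Build a VPTERNLOG imm8: bit number (a<<2 | b<<1 | c) of the immediate
   holds the function's result for that combination of input bits. */
uint8_t ternlog_imm8(bool3_fn f)
{
    uint8_t imm = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        unsigned a = (idx >> 2) & 1, b = (idx >> 1) & 1, c = idx & 1;
        if (f(a, b, c) & 1)
            imm |= (uint8_t)(1u << idx);
    }
    return imm;
}

/* Three common 3-input functions found in hash rounds. */
unsigned xor3(unsigned a, unsigned b, unsigned c)     { return a ^ b ^ c; }
unsigned majority(unsigned a, unsigned b, unsigned c) { return (a & b) | (a & c) | (b & c); }
unsigned choose(unsigned a, unsigned b, unsigned c)   { return (a & b) | (~a & c); }
```

As a usage example, `assert(ternlog_imm8(xor3) == 0x96)` and `assert(bytes_saved_per_constant() == 60)` both hold. The second figure is per constant and says nothing about the total .data reduction in this change, which depends on how many constants each routine uses.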

Shark64 commented 1 year ago

I had forgotten that this pull request was still open. Sorry. Is there something that I have to change? The tests all look OK.

Shark64 commented 1 year ago

Ping @gbtucker 😉