DLTcollab / sse2neon

A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
MIT License

Absent `_mm_aesdec_si128` and `_mm_aesdeclast_si128` #477

Closed (norwend closed this issue 1 year ago)

norwend commented 3 years ago

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_aesdec_si128&expand=262
https://developer.arm.com/architectures/instruction-sets/intrinsics/vaesdq_u8

jserv commented 3 years ago

At first glance, aes-brute-force/src/aes_ni_botan.cpp looks like a useful reference.

jserv commented 3 years ago

Another AES-Crypto mapping example: https://gist.github.com/mmozeiko/f9c999dda7dbb03722409854a1c39cc2

jserv commented 3 years ago

IIRC, @wangxiao1254 implemented _mm_aesimc_si128, _mm_aesdec_si128, and _mm_aesdeclast_si128 with ARMv8 Cryptography Extensions. See https://github.com/f1ed/emp/blob/master/emp-tool/utils/block.h

However, for SSE2NEON, we need ARMv7/non-crypto-ext counterparts.
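To make concrete what a non-crypto-ext counterpart has to compute, here is a minimal scalar sketch in plain C of the two round operations, following the Intel definition: InvShiftRows, InvSubBytes, InvMixColumns (skipped in the last round), then XOR with the round key. It is not the code that landed in sse2neon. The names `aesdec_portable` and `aesdeclast_portable` and the `uint8_t[16]` interface are hypothetical, the S-box is derived at run time only to keep the listing short, and the table lookups are not constant-time.

```c
#include <stdint.h>

static uint8_t sbox[256], inv_sbox[256];
static int tables_ready;

/* Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1)
            p ^= a;
        a = (uint8_t) ((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
        b >>= 1;
    }
    return p;
}

static uint8_t rotl8(uint8_t x, int n)
{
    return (uint8_t) ((x << n) | (x >> (8 - n)));
}

/* Build the S-box and its inverse once (lazy init, not thread-safe). */
static void init_tables(void)
{
    for (int x = 0; x < 256; x++) {
        uint8_t inv = 0; /* 0 has no multiplicative inverse and maps to 0 */
        for (int y = 1; y < 256 && x; y++) {
            if (gf_mul((uint8_t) x, (uint8_t) y) == 1) {
                inv = (uint8_t) y;
                break;
            }
        }
        /* Affine transformation of the inverse gives the S-box entry. */
        uint8_t s = (uint8_t) (inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
                               rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
        sbox[x] = s;
        inv_sbox[s] = (uint8_t) x;
    }
    tables_ready = 1;
}

/* InvShiftRows then InvSubBytes; byte i is in row i % 4, column i / 4. */
static void inv_shift_sub(uint8_t s[16], const uint8_t a[16])
{
    if (!tables_ready)
        init_tables();
    for (int c = 0; c < 4; c++)
        for (int r = 0; r < 4; r++)
            s[r + 4 * c] = inv_sbox[a[r + 4 * ((c + 4 - r) % 4)]];
}

/* Hypothetical scalar equivalent of _mm_aesdeclast_si128. */
void aesdeclast_portable(uint8_t out[16], const uint8_t a[16],
                         const uint8_t key[16])
{
    uint8_t s[16];
    inv_shift_sub(s, a);
    for (int i = 0; i < 16; i++)
        out[i] = s[i] ^ key[i];
}

/* Hypothetical scalar equivalent of _mm_aesdec_si128. */
void aesdec_portable(uint8_t out[16], const uint8_t a[16],
                     const uint8_t key[16])
{
    uint8_t s[16];
    inv_shift_sub(s, a);
    for (int c = 0; c < 4; c++) {
        const uint8_t *p = s + 4 * c; /* one state column */
        const uint8_t *k = key + 4 * c;
        uint8_t *o = out + 4 * c;
        /* InvMixColumns: multiply the column by the 0e/0b/0d/09 matrix. */
        o[0] = gf_mul(p[0], 14) ^ gf_mul(p[1], 11) ^ gf_mul(p[2], 13) ^ gf_mul(p[3], 9) ^ k[0];
        o[1] = gf_mul(p[0], 9) ^ gf_mul(p[1], 14) ^ gf_mul(p[2], 11) ^ gf_mul(p[3], 13) ^ k[1];
        o[2] = gf_mul(p[0], 13) ^ gf_mul(p[1], 9) ^ gf_mul(p[2], 14) ^ gf_mul(p[3], 11) ^ k[2];
        o[3] = gf_mul(p[0], 11) ^ gf_mul(p[1], 13) ^ gf_mul(p[2], 9) ^ gf_mul(p[3], 14) ^ k[3];
    }
}
```

A NEON version without the Cryptography Extensions would vectorize these same steps, for example with table lookups for the inverse S-box and a byte shuffle for InvShiftRows. The scalar form is mainly useful as a reference to test such a version against.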

jserv commented 2 years ago

Drop-in implementations with ARMv8 Cryptography Extensions:

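/* AESD applies AddRoundKey first, so pass a zero key and XOR RoundKey last. */
__m128i _mm_aesdec_si128(__m128i a, __m128i RoundKey)
{
    uint8x16_t s = vaesdq_u8(vreinterpretq_u8_m128i(a), vdupq_n_u8(0));
    return vreinterpretq_m128i_u8(
        veorq_u8(vaesimcq_u8(s), vreinterpretq_u8_m128i(RoundKey)));
}
/* Final round: same as above, but without InvMixColumns. */
__m128i _mm_aesdeclast_si128(__m128i a, __m128i RoundKey)
{
    uint8x16_t s = vaesdq_u8(vreinterpretq_u8_m128i(a), vdupq_n_u8(0));
    return vreinterpretq_m128i_u8(veorq_u8(s, vreinterpretq_u8_m128i(RoundKey)));
}
__m128i _mm_aesimc_si128(__m128i a)
{
    return vreinterpretq_m128i_u8(vaesimcq_u8(vreinterpretq_u8_m128i(a)));
}

jserv commented 1 year ago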

> However, for SSE2NEON, we need ARMv7/non-crypto-ext counterparts.

The portable implementations:

jserv commented 1 year ago

_mm_aesdec_si128 is implemented in commit 0f28c2539e950cd8dde2f042c39b87cd61c82937.