Open e4m2 opened 1 year ago
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics See info in area-owners.md if you want to be subscribed.
| Author: | e4m2 |
|---|---|
| Assignees: | - |
| Labels: | `api-suggestion`, `area-System.Runtime.Intrinsics`, `untriaged` |
| Milestone: | - |
public new abstract class VL : Avx512F.VL
{
public static new bool IsSupported { get; }
public static Vector128<byte> Decrypt(Vector128<byte> value, Vector128<byte> roundKey);
public static Vector128<byte> DecryptLast(Vector128<byte> value, Vector128<byte> roundKey);
public static Vector256<byte> Encrypt(Vector256<byte> value, Vector256<byte> roundKey);
public static Vector256<byte> EncryptLast(Vector256<byte> value, Vector256<byte> roundKey);
}
What's the benefit of exposing the EVEX variants separately?
Technically, `VAES` and `AVX512-F` only indicate 512-bit operation; `AVX512-VL` is required to use 128-bit and 256-bit vectors, hence the dedicated subclass. If you're asking why they exist when the VEX forms exist, it's probably just to allow the user to choose which prefix to use, or for consistency.
> If you're asking why they exist when the VEX forms exist, it's probably just to allow the user to choose which prefix to use, or for consistency.
Users don't get to pick the prefix; the JIT picks the most optimal form. For V512, EVEX is required. For V128/V256, it will pick VEX if only the lower 16 SIMD registers are used. If LSRA must allocate an extended SIMD register (one of the upper 16), or decides that it can take advantage of another EVEX-only feature such as embedded broadcast or embedded masking, then it may use EVEX instead (assuming the hardware is capable, of course).
We intentionally do not duplicate APIs needlessly, and so we shouldn't need them under `Avx512Vaes.VL`.
Given that, given the future for `Avx10`, and given what we had previously opted for with `VPCLMULQDQ` (https://github.com/dotnet/runtime/issues/95772), we should likely name these `Aes256` and `Aes512`, respectively.
However, depending on how we decide to do `Avx10`, it may be "better" to have these in nested `V256`/`V512` classes under `Aes` and `Pclmulqdq` instead.
@e4m2, could you update to follow the same general pattern as `Pclmulqdq` for now and then I can get this reviewed after or as part of the `Avx10` work, at which point we'll know the desired pattern?
Thanks for the input. Updated!
namespace System.Runtime.Intrinsics.X86;
public abstract class Aes
{
public abstract class V256
{
public static new bool IsSupported { get; }
public static Vector256<byte> Decrypt(Vector256<byte> value, Vector256<byte> roundKey);
public static Vector256<byte> DecryptLast(Vector256<byte> value, Vector256<byte> roundKey);
public static Vector256<byte> Encrypt(Vector256<byte> value, Vector256<byte> roundKey);
public static Vector256<byte> EncryptLast(Vector256<byte> value, Vector256<byte> roundKey);
}
public abstract class V512
{
public static Vector512<byte> Decrypt(Vector512<byte> value, Vector512<byte> roundKey);
public static Vector512<byte> DecryptLast(Vector512<byte> value, Vector512<byte> roundKey);
public static Vector512<byte> Encrypt(Vector512<byte> value, Vector512<byte> roundKey);
public static Vector512<byte> EncryptLast(Vector512<byte> value, Vector512<byte> roundKey);
}
}
Background and motivation
On some newer x86 CPUs, VAES provides wider variants of the encryption/decryption instructions included in the older AES instruction set.
The 256-bit VEX-encoded variant (effectively operating on 2 AES blocks in parallel using a single instruction) has a separate CPUID flag and is not dependent on AVX512 support. Additionally, if AVX512F is supported, a 512-bit EVEX-encoded variant is available. As expected, EVEX-encoded 128-bit and 256-bit variants are available if AVX512VL is supported.
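As a rough illustration of the separate CPUID flag mentioned above, the sketch below reads the VAES feature bit (CPUID leaf 7, sub-leaf 0, ECX bit 9) through the existing `X86Base.CpuId` API. The `VaesDetect` class and `CpuReportsVaes` method are hypothetical names used only for this example. It also assumes that the raw bit is all you want to inspect: it does not confirm that the OS has enabled the required AVX/AVX-512 register state, so real code would additionally need checks such as `Avx.IsSupported` or `Avx512F.IsSupported`.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration; not part of the proposed API.
static class VaesDetect
{
    // Returns true when the CPU reports the VAES feature bit.
    // This inspects raw CPUID only and says nothing about OS-enabled state.
    public static bool CpuReportsVaes()
    {
        if (!X86Base.IsSupported)
            return false; // not running on x86/x64

        // Leaf 0 reports the highest supported standard leaf in EAX.
        (int maxLeaf, _, _, _) = X86Base.CpuId(0, 0);
        if (maxLeaf < 7)
            return false;

        // Leaf 7, sub-leaf 0: ECX bit 9 is the VAES flag.
        (_, _, int ecx, _) = X86Base.CpuId(7, 0);
        return (ecx & (1 << 9)) != 0;
    }
}
```

With the proposed API, this manual check would presumably be replaced by the corresponding `IsSupported` property.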
API Proposal
Note that VAES doesn't include round key assist or inverse mix columns instructions.
API Usage
Same as AES intrinsics, except using wider vector types.
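To make the comparison concrete, here is a minimal sketch of one AES round applied to two independent blocks using the existing 128-bit `Aes.Encrypt` intrinsic. The `VaesSketch` class and `EncryptRoundTwoBlocks` method are hypothetical names for this example. The trailing comment shows how the same step might look with the `Aes.V256` shape proposed in this issue; that part is an assumption about an API that is still under review, so it is left as a comment rather than compiled code.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration only.
static class VaesSketch
{
    // Applies one AES encryption round to two independent blocks with the
    // same round key, using the existing 128-bit intrinsic twice.
    public static (Vector128<byte> Lo, Vector128<byte> Hi) EncryptRoundTwoBlocks(
        Vector128<byte> block0, Vector128<byte> block1, Vector128<byte> roundKey)
    {
        if (!Aes.IsSupported)
            throw new PlatformNotSupportedException("AES intrinsics are not available.");

        return (Aes.Encrypt(block0, roundKey), Aes.Encrypt(block1, roundKey));
    }

    // Assumed equivalent under the proposed API (not yet approved):
    //
    //   Vector256<byte> state = Vector256.Create(block0, block1);
    //   Vector256<byte> keys  = Vector256.Create(roundKey, roundKey);
    //   Vector256<byte> next  = Aes.V256.Encrypt(state, keys);
    //
    // Both blocks would then be processed by a single instruction.
}
```

Note that the round key has to be replicated into each 128-bit lane, since each lane is processed as a separate AES block.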
Alternative Designs
No response
Risks
No response
References
https://en.wikichip.org/wiki/x86/vaes
https://en.wikipedia.org/wiki/AVX-512#VAES
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#othertechs=VAES