Oxsomi Core3 is a collection of standalone C libraries useful for building applications, covering types, platform, graphics abstraction and file formats.
Currently the implementation is somewhat split: one code path uses intrinsics and the other doesn't.
Even though this makes sense for some functions, it would be easier to generalize them in I32x4, which allows a unified architecture, and to put the fallback there instead.
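A minimal sketch of what such a unified I32x4 could look like, assuming SSE2 on the intrinsic path. The function names (I32x4_create4, I32x4_add, I32x4_xor, I32x4_store, I32x4_mixExample, I32x4_selfTest) are hypothetical and chosen for illustration only; they are not taken from the existing code. The idea is that callers are written once against the I32x4 interface and the intrinsic/fallback choice lives only inside the vector type:

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)

    /* Intrinsic path: I32x4 is the native 128-bit register type. */
    #include <emmintrin.h>
    typedef __m128i I32x4;

    static inline I32x4 I32x4_create4(int32_t x, int32_t y, int32_t z, int32_t w) {
        return _mm_set_epi32(w, z, y, x);    /* _mm_set_epi32 takes lanes high to low */
    }

    static inline I32x4 I32x4_add(I32x4 a, I32x4 b) { return _mm_add_epi32(a, b); }
    static inline I32x4 I32x4_xor(I32x4 a, I32x4 b) { return _mm_xor_si128(a, b); }

    static inline void I32x4_store(I32x4 a, int32_t out[4]) {
        _mm_storeu_si128((__m128i*) out, a);
    }

#else

    /* Fallback path: same interface, plain C, no intrinsics. */
    typedef struct I32x4 { int32_t v[4]; } I32x4;

    static inline I32x4 I32x4_create4(int32_t x, int32_t y, int32_t z, int32_t w) {
        I32x4 r = { { x, y, z, w } };
        return r;
    }

    static inline I32x4 I32x4_add(I32x4 a, I32x4 b) {
        I32x4 r;
        for (int i = 0; i < 4; ++i)          /* add as unsigned so wraparound is well defined */
            r.v[i] = (int32_t) ((uint32_t) a.v[i] + (uint32_t) b.v[i]);
        return r;
    }

    static inline I32x4 I32x4_xor(I32x4 a, I32x4 b) {
        I32x4 r;
        for (int i = 0; i < 4; ++i)
            r.v[i] = a.v[i] ^ b.v[i];
        return r;
    }

    static inline void I32x4_store(I32x4 a, int32_t out[4]) {
        for (int i = 0; i < 4; ++i)
            out[i] = a.v[i];
    }

#endif

/* Written once against the I32x4 interface; compiles unchanged on either path. */
static I32x4 I32x4_mixExample(I32x4 a, I32x4 b, I32x4 c) {
    return I32x4_xor(I32x4_add(a, b), c);
}

/* Small usage check: the result must be identical on both paths. */
static void I32x4_selfTest(void) {
    int32_t out[4];
    I32x4 r = I32x4_mixExample(
        I32x4_create4(1, 2, 3, 4),
        I32x4_create4(10, 20, 30, 40),
        I32x4_create4(1, 1, 1, 1)
    );
    I32x4_store(r, out);
    assert(out[0] == 10 && out[1] == 23 && out[2] == 32 && out[3] == 45);
}
```

With this shape, higher-level code never needs its own #if for intrinsics; only the vector type does.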
SHA256 should only use the _mm_sha256* intrinsics directly. Remove the following:
[x] _mm_alignr_epi8
[x] _mm_blend_epi16
[x] _mm_set_epi64x
[x] _mm_shuffle_epi8
[x] GHASH, CRC32 and SHA are okay, as a decently optimized version is already present. Changing these to emulate the instructions exactly would make them slower and would bloat the vector class.
[x] Move very specific functions such as AES key assist and enc to AES instead of making them accessible to everything. Otherwise they will bloat the vector class.
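As an illustration of how one of the helper intrinsics in the list above could be replaced by a generalized operation, here is a hedged sketch of a portable byte shuffle. The type U8x16 and the functions U8x16_shuffle, U8x16_swapEndian32 and U8x16_selfTest are hypothetical names used only for this example; in practice this would presumably live on I32x4 itself. The selection rule mirrors the 128-bit _mm_shuffle_epi8: a selector byte with its high bit set yields 0, otherwise its low 4 bits index the source. SHA256 reads its message words as big-endian, so a per-lane byte swap is the typical use:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 16-byte vector used only for this sketch. */
typedef struct U8x16 { uint8_t b[16]; } U8x16;

/* Portable byte shuffle: out[i] = (sel[i] & 0x80) ? 0 : src[sel[i] & 15]. */
static U8x16 U8x16_shuffle(U8x16 src, U8x16 sel) {
    U8x16 out;
    for (int i = 0; i < 16; ++i)
        out.b[i] = (sel.b[i] & 0x80) ? 0 : src.b[sel.b[i] & 15];
    return out;
}

/* Reverse the bytes of each 32-bit lane (big-endian <-> little-endian words). */
static U8x16 U8x16_swapEndian32(U8x16 src) {
    static const U8x16 sel = { {
         3,  2,  1,  0,
         7,  6,  5,  4,
        11, 10,  9,  8,
        15, 14, 13, 12
    } };
    return U8x16_shuffle(src, sel);
}

/* Small usage check: swapping twice must return the original bytes. */
static void U8x16_selfTest(void) {
    U8x16 src;
    for (int i = 0; i < 16; ++i)
        src.b[i] = (uint8_t) (i * 3 + 1);
    U8x16 twice = U8x16_swapEndian32(U8x16_swapEndian32(src));
    for (int i = 0; i < 16; ++i)
        assert(twice.b[i] == src.b[i]);
}
```

On the intrinsic path the same function could map to a single instruction, so callers such as SHA256 would not need to reference the intrinsic themselves.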