arnetheduck closed 1 year ago
> of course, ideal alignment would be done up to 64 bytes but this breaks dynamic allocation which will not let itself be aligned further than 16 (typically).
An aligned allocator (`posix_memalign`) can be exposed for `ptr UncheckedArray`. And within Nim objects, the compiler will do the right thing.
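As a rough illustration of the aligned-allocator idea, here is a minimal C sketch of the call such a wrapper would sit on top of. The helper names `aligned_buffer_new` and `is_aligned` are hypothetical and not part of any library discussed here; only `posix_memalign` itself is the real POSIX function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* request the posix_memalign declaration */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `size` zeroed bytes aligned to `alignment`.
   POSIX requires `alignment` to be a power of two and a multiple of
   sizeof(void *). Returns NULL on failure. */
void *aligned_buffer_new(size_t alignment, size_t size) {
    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0) {
        return NULL;
    }
    memset(p, 0, size);
    return p;
}

/* Hypothetical helper: returns 1 if `p` is aligned to `alignment` bytes. */
int is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}
```

A buffer obtained this way is released with the ordinary `free`. On the Nim side the resulting pointer could presumably be cast to a `ptr UncheckedArray`, but that binding is not shown here.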
For seq objects, we can log a feature request to Nim upstream. It would be very helpful in other domains like scientific computing/machine learning/image processing as well.
> An aligned allocator (`posix_memalign`) can be exposed for `ptr UncheckedArray`.
`MDigest` is too much of a general-purpose type for this to make sense, i.e. for such advanced use cases, the instance itself can be made "more" aligned (rather than the data field inside).
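To make the distinction concrete, the following C11 sketch over-aligns a single instance while leaving the type's own alignment untouched. The type `digest_t` is a hypothetical stand-in for a 256-bit digest, not nimcrypto's actual definition, and the sketch assumes a compiler that honours 64-byte alignment for local variables (gcc and clang do).

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical stand-in for a 256-bit digest: a plain byte array,
   so the type itself only requires 1-byte alignment. */
typedef struct {
    uint8_t data[32];
} digest_t;

/* Over-align one particular instance instead of changing the type.
   Returns 1 if the local instance ended up 64-byte aligned. */
int instance_is_aligned(void) {
    alignas(64) digest_t d = {{0}};
    return ((uintptr_t)&d % 64) == 0;
}
```

With this approach only the call sites that care about alignment pay for it, and every other user of the type keeps the default layout.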
The introduction of alignment in `MDigest` allows the compiler to choose aligned instructions for copying, zeroing and processing digests, resulting in better codegen on platforms with such instructions and better performance on platforms where unaligned access is heavily penalised.
Here's an `MDigest[256]` copy without alignment:

Same, but with alignment:
We can see aligned loads/stores used for both (using gcc / generic x86_64 CPU). Of course, ideal alignment would be done up to 64 bytes, but this breaks dynamic allocation, which will not let itself be aligned further than 16 (typically).
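The trade-off described above can be sketched in C11, which is roughly the level at which gcc sees the generated code. The types `digest_plain` and `digest_aligned` are hypothetical stand-ins, not the real `MDigest` layout, and `malloc_guaranteed_alignment` is an illustrative helper; the exact instructions the compiler emits depend on target and flags, so none are claimed here.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 256-bit digest without an alignment request: alignment 1. */
typedef struct {
    uint8_t data[32];
} digest_plain;

/* Same payload, but the data field asks for 16-byte alignment, which
   raises the alignment of the whole struct to 16. */
typedef struct {
    alignas(16) uint8_t data[32];
} digest_aligned;

/* A plain struct copy; with a known 16-byte alignment the compiler is
   free to pick aligned vector loads/stores for it. */
void digest_copy(digest_aligned *dst, const digest_aligned *src) {
    *dst = *src;
}

/* malloc only promises alignment suitable for any fundamental type,
   i.e. alignof(max_align_t), which is commonly 16 on x86_64. */
size_t malloc_guaranteed_alignment(void) {
    return alignof(max_align_t);
}
```

A type that demanded 64-byte alignment would exceed what `malloc_guaranteed_alignment()` reports on typical platforms, so heap instances would need an aligned allocator instead of plain `malloc`.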