Open tannergooding opened 7 years ago
This could perhaps be a special value that would otherwise be invalid (such as Alignment=-1). Other special alignments could also be allowed in a similar manner.
Perhaps -4 to mean 4 octets from the end of the cache line?
This probably deserves/requires input from some runtime folks as well, given that they have the best understanding of how determining alignment works today.
@karelz, do you know who should be tagged?
Going to tag @jkotas, @fiigii, and @mellinoe right now.
This will be very useful for ensuring the backing data structures are properly aligned when they are used in combination with the Hardware Intrinsics feature.
There are two aspects of this:
Is this issue about 1, 2 or both?
The second (controlling alignment for the entire type).
If that is provided, the first can be achieved by aligning the whole type and using the appropriate field offset attributes on the individual members.
It does, however, somewhat extend into the range of both when dealing with types that have an alignment but are also members of another type. I mentioned in the original comment some scenarios where this may come up.
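As a minimal sketch of the "field offset attributes" half of that approach, the existing `StructLayoutAttribute` and `FieldOffsetAttribute` already let a type fix where each member sits relative to the start of the struct. The type and field names below (`Header`, `Id`, `Value`) are hypothetical. Note that this only controls offsets inside the type; it does not make the runtime place the struct itself at an aligned address, which is the part this issue asks for.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type: members are placed at fixed offsets from the struct start.
// If the struct itself started on a 16-byte boundary, Value would too.
[StructLayout(LayoutKind.Explicit, Size = 32)]
public struct Header
{
    [FieldOffset(0)]  public long Id;     // bytes 0..7
    [FieldOffset(16)] public double Value; // bytes 16..23, 16-byte offset from the start
}

public static class HeaderLayout
{
    // Offset of Value inside Header, as reported by the marshaller.
    public static int ValueOffset() =>
        Marshal.OffsetOf<Header>(nameof(Header.Value)).ToInt32();

    // Total size of Header, including the explicit padding.
    public static int Size() => Marshal.SizeOf<Header>();
}
```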
FEATURE_STRUCTALIGN ifdefs.
@jkotas, thanks for the reference (going to take a look through this when I have some time)!
I'm guessing the issue isn't in the first allocation, since that can be considered "trivial". That is, you just need to allocate, at most, Size + (Alignment - 1) bytes and return the first address with the correct alignment.
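The over-allocation arithmetic can be sketched as follows. This is only an illustration of the `Size + (Alignment - 1)` bound, not how the GC allocates; the class and method names are hypothetical, and the alignment is assumed to be a power of two.

```csharp
using System;

public static class AlignedAllocationSketch
{
    // Upper bound on the bytes to reserve so that an aligned run of `size`
    // bytes is guaranteed to fit, whatever address the reservation starts at.
    public static long WorstCaseBytes(long size, long alignment) =>
        size + (alignment - 1);

    // First address >= `address` that is a multiple of `alignment`.
    // Assumes `alignment` is a positive power of two.
    public static long AlignUp(long address, long alignment)
    {
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("Alignment must be a positive power of two.", nameof(alignment));
        return (address + alignment - 1) & ~(alignment - 1);
    }
}
```

For example, a 32-byte object with 16-byte alignment needs at most 47 reserved bytes: if the reservation starts at address 1, the aligned object occupies addresses 16 through 47.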
So, I think the hardest part for the GC probably comes into play when the heap is compacted or when objects are otherwise moved, since alignment limits where an object can be moved to. I'm wondering, however, if this can be done without bringing in too much cost.
I would think (possibly naively) that the GC would set a flag indicating whether an object is aligned (or maybe keep a separate tree containing these objects or something similar). Most objects are not expected to be aligned, so they don't need to do anything else. The few objects that are aligned need to be relocated to an address that is still aligned. The destination can be any free block that is between Size and Size + (Alignment - 1) bytes in length (where Size is enough for a block that starts at a perfectly aligned address and Size + (Alignment - 1) is needed for a block with "worst case" alignment).
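A small sketch of that relocation check, under the assumption that the GC knows the start address and length of a candidate free block. The names are hypothetical and this is not taken from the actual GC code; it only shows why the space required varies between `Size` and `Size + (Alignment - 1)`.

```csharp
public static class RelocationSketch
{
    // Bytes consumed if an object of `size` bytes is placed at the first
    // aligned address inside a block starting at `blockStart`.
    // Assumes `alignment` is a positive power of two.
    public static long BytesNeeded(long blockStart, long size, long alignment)
    {
        long misalignment = blockStart & (alignment - 1);
        long padding = (alignment - misalignment) & (alignment - 1); // 0 when already aligned
        return size + padding;
    }

    // True when the aligned object fits inside the candidate block.
    public static bool Fits(long blockStart, long blockLength, long size, long alignment) =>
        BytesNeeded(blockStart, size, alignment) <= blockLength;
}
```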
That is the 10,000ft view of how this may work. You can tell from the 37 FEATURE_STRUCTALIGN ifdefs left over in the GC from a previous attempt to implement this that it is not exactly trivial to implement. Also, I would expect that the implementation itself is not where most of the work would be - most of the work would be in both functional and performance testing.
I put a bit of thought into why the feature is requested (feel free to correct me if you disagree)...
On modern computers, unaligned reads/writes are (generally speaking) as fast as aligned reads/writes. The exception to this is when the load/store crosses a cache-line boundary (or worse, a page boundary).
Looking at the "Intel Optimization Manual", a load/store that crosses a cache-line boundary can take ~4.5x more cycles on modern CPUs and more on older (this is assuming I didn't miss a section that says something different for even newer processors).
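To make the boundary condition concrete, here is a hedged sketch that checks whether a load or store of a given size straddles a boundary. The 64-byte cache-line size is an assumption (it is common on current x86 processors but not universal), and the names are hypothetical.

```csharp
public static class CacheLineSketch
{
    // Assumed cache-line size in bytes; real hardware may differ.
    public const int AssumedCacheLineSize = 64;

    // True when the byte range [address, address + size) spans two or more
    // `boundary`-sized lines. Pass 4096 to check a typical page boundary.
    public static bool CrossesBoundary(long address, int size, int boundary = AssumedCacheLineSize)
    {
        long firstLine = address / boundary;
        long lastLine = (address + size - 1) / boundary;
        return firstLine != lastLine;
    }
}
```

A 16-byte access at a 16-byte-aligned address never crosses a 64-byte line, while the same access at address 56 does, which is the case alignment is meant to rule out.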
The most commonly used alignments will likely be:
Other alignments (those between cache line size and page size), as far as I can tell, do not provide any real performance benefit. This is because there is no register which can read the data all at once and because it won't provide any additional guarantees of not crossing a cache-line or page boundary.
If getting the GC to support custom aligned types is hard (and we are not likely to get this feature any time soon), then is there a reasonable workaround for the near or long term? For example:
Marshal class, some of which are probably fixable
On the other hand, has any consideration been given to supporting custom aligned types, but with certain limitations? For example:
custom alignment is supported, but only for arrays.
Yes, arrays are our only use case. Actually, we would not even need all arrays; gaining control over byte[] arrays only would already be sufficient, thanks to MemoryMarshal.Cast.
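A rough sketch of what that byte[]-only workaround can look like with APIs that exist today: over-allocate a `byte[]`, pin it with `GCHandle` so the GC cannot move it, slice from the first aligned address, and reinterpret the bytes with `MemoryMarshal.Cast`. The `AlignedByteBuffer` type is hypothetical and not a framework API, and long-lived pinning has its own costs for the GC.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: an aligned window over a pinned, over-allocated byte[].
public sealed class AlignedByteBuffer : IDisposable
{
    private readonly byte[] _backing;
    private GCHandle _handle;
    private readonly int _offset;
    private readonly int _length;

    public AlignedByteBuffer(int length, int alignment)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("Alignment must be a positive power of two.", nameof(alignment));

        // Worst case: Size + (Alignment - 1) bytes.
        _backing = new byte[length + alignment - 1];
        // Pin so the address computed below stays valid.
        _handle = GCHandle.Alloc(_backing, GCHandleType.Pinned);

        long address = _handle.AddrOfPinnedObject().ToInt64();
        _offset = (int)((alignment - (address & (alignment - 1))) & (alignment - 1));
        _length = length;
    }

    // Address of the first aligned byte.
    public long Address => _handle.AddrOfPinnedObject().ToInt64() + _offset;

    // The aligned window as bytes.
    public Span<byte> Bytes => _backing.AsSpan(_offset, _length);

    // The same memory viewed as another unmanaged element type.
    public Span<T> As<T>() where T : struct => MemoryMarshal.Cast<byte, T>(Bytes);

    public void Dispose()
    {
        if (_handle.IsAllocated) _handle.Free();
    }
}
```

For example, `new AlignedByteBuffer(64, 32).As<float>()` yields a 16-element `Span<float>` whose first element sits on a 32-byte boundary. The helper has to be disposed to release the pin, which is the kind of inconvenience that direct GC support would remove.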
If https://github.com/dotnet/coreclr/issues/19936 is implemented, you'll at least be able to roll your own aligned buffers with the knowledge the GC will never move them.
My interest would be for CMPXCHG16b with an "object reference + tag" type struct.
@saucecontrol A memory mapped file will already give you aligned buffers. However, it's an IDisposable object to deal with. To make aligned memory convenient, we need support from the GC.
The issue I linked is specifically about adding GC support. It doesn't handle the alignment, but it solves the problem of the GC potentially moving something after you've found an aligned section to work with.
There's also https://github.com/dotnet/corefx/issues/31787, which addresses aligned allocation of arrays.
Rationale
In certain high performance or specialized data structures/algorithms, it is desirable to enforce an alignment for structs, fields, or locals.
Today, CoreFX provides several specialized data structures for which the runtime either has special alignment handling (System.Numerics.Vector) or for which they have some specialized padding (https://github.com/dotnet/corefx/pull/22724).
As such, the framework/runtime should provide a mechanism for enforcing a specified alignment for structs and fields. Locals should also be included if that is feasible (I'm not sure if that is readily possible today given that attributes cannot be specified on locals).
Additional Thoughts
It might be worthwhile to additionally expose this on the existing StructLayoutAttribute as an Alignment property.
An alignment of 0 should be treated as "Automatic" (the current behavior of letting the runtime decide alignment).
A mechanism for aligning to the cache would be ideal (https://github.com/dotnet/corefx/pull/22724#issuecomment-319075196). This could perhaps be a special value that would otherwise be invalid (such as Alignment=-1). Other special alignments could also be allowed in a similar manner.
If a field specifies an alignment less than that of the struct, it should be aligned to the alignment of the struct. For example, if you do Alignment=8 on a Vector4 (which has an Alignment=16), the field should be treated as Alignment=16.
[Design Decision] If a struct specifies an alignment less than that of its first field it should either:
A. Align the struct as specified and add the appropriate padding so that the first field is also aligned as specified -or-
B. Align the struct as per the requirements of the first field
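The proposed rules for a field's alignment can be sketched as a small function. This only models the semantics described above (0 means "Automatic", and a request smaller than the type's own alignment is raised to it); no `Alignment` property exists on `StructLayoutAttribute` today, the names below are hypothetical, and the special negative values and the open design decision are deliberately left out.

```csharp
using System;

public static class ProposedAlignmentRules
{
    // Proposed meaning of 0: let the runtime decide.
    public const int Automatic = 0;

    // Alignment a field would end up with, given the alignment requested on
    // the field and the alignment its type already requires.
    public static int EffectiveFieldAlignment(int requested, int typeAlignment)
    {
        if (requested == Automatic) return typeAlignment;
        return Math.Max(requested, typeAlignment);
    }
}
```

With these rules, requesting 8 on a field whose type requires 16 (the Vector4 example) still yields 16, while requesting 32 yields 32.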
[EDIT] Make reference to the PR a link by @karelz