dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.47k stars 4.76k forks source link

Struct field affects size of derived class #109680

Open timcassell opened 3 weeks ago

timcassell commented 3 weeks ago

Description

Adding a struct field to a base class makes the derived class larger than expected.

Reproduction Steps

[Benchmark]
public MyClass1<byte> MyClass1() => new MyClass1<byte>();

[Benchmark]
public MyClass2<byte> MyClass2() => new MyClass2<byte>();
public class MyBaseClass1
{
    private struct ObjectWrapper
    {
        internal object _obj;
    }

    private ObjectWrapper _objectWrapper;

    private short _id;
    private bool _flag1;
    private bool _flag2;
    private bool _flag3;
    private bool _flag4;
}

public class MyBaseClass2
{
    internal object _obj;

    private short _id;
    private bool _flag1;
    private bool _flag2;
    private bool _flag3;
    private bool _flag4;
}

public class MyClass1<T> : MyBaseClass1
{
    private T _value;
}

public class MyClass2<T> : MyBaseClass2
{
    private T _value;
}

Expected behavior

The type size should be the same whether the field is a reference type, or a struct wrapper around a reference type.

Actual behavior

Method Allocated
MyClass1 40 B
MyClass2 32 B

Regression?

No, the behavior is the same on .Net Framework

Known Workarounds

Don't use struct fields.

Add an additional base type above that type to contain the struct field.

Configuration

BenchmarkDotNet v0.14.1-develop (2024-11-09), Windows 10 (10.0.19045.5011/22H2/2022Update)
AMD Phenom(tm) II X6 1055T Processor 2.80GHz, 1 CPU, 6 logical and 6 physical cores
.NET SDK 9.0.100-rc.2.24474.11
  [Host]   : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT SSE3
  ShortRun : .NET 8.0.10 (8.0.1024.46610), X64 RyuJIT SSE3

Other information

Possibly related to #109585.

Also possibly related to https://stackoverflow.com/questions/67068942/c-sharp-why-do-class-fields-of-struct-types-take-up-more-space-than-the-size-of and https://stackoverflow.com/questions/24742325/why-does-struct-alignment-depend-on-whether-a-field-type-is-primitive-or-user-de.

MichalPetryka commented 3 weeks ago

Seems like the VM is moving the struct field to the end of the object for some reason?

huoyaoyuan commented 2 weeks ago

A question: is Auto layout expected to be observable in non-diagnostic usages? Or is it just an implementation detail?

AaronRobinsonMSFT commented 2 weeks ago

A question: is Auto layout expected to be observable in non-diagnostic usages? Or is it just an implementation detail?

I'm not entirely sure what this means. The LayoutKind.Auto represents the algorithm that is used by the runtime to layout the type in the most efficient way it sees fit. Changing the auto layout algorithm isn't necessarily a task the runtime is likely to attempt unless there is a clear benefit. In this case if LayoutKind.Auto is choosing that size, that is the correct size as determined by the runtime. If someone wants more control over the size of a type a value type should be used marking it with LayoutKind.Sequential.

Note this doesn't mean that in the future changes aren't possible. It does mean that the LayoutKind.Auto should be treated as an implementation detail and as the name "Auto" implies it is left as an exercise for the runtime to lay the type out in the manner best suited for performance with the VM and JIT.

kg commented 2 weeks ago

From experimenting, StructLayout.Sequential doesn't impact this, the struct will still get reordered to the end of the class by the runtime (IIRC it has the right to do this; the sequential-ness is only guaranteed at marshaling boundaries? I may be misremembering but this is how it worked on netfx too.)

If you add an extra uint16 to ObjectWrapper it becomes clearer what might be happening - ObjectWrapper will get rounded up to 2x the size of an object reference, so it gets padding at the end of it before _value appears, and then you potentially have padding after _value.

I think the calculation that determines alignment/packing here is not 'seeing through' the struct as desired. It would be nice if it did but I'm not sure you could actually generalize a rule like that without having nasty consequences in cases where you're passing refs or pointers around, the size of a struct within a containing type probably needs to always be the same as its size on the heap. Maybe something could be done for the specific case of a struct with a single member where layout decomposes it?

I'd be wary that this could break existing code.

timcassell commented 2 weeks ago

I think the calculation that determines alignment/packing here is not 'seeing through' the struct as desired.

That does seem like it could be the case both for this issue and at least 2 of the linked issues.

I'm not sure you could actually generalize a rule like that without having nasty consequences in cases where you're passing refs or pointers around

I don't understand. How would passing refs affect it? The struct itself is always at a fixed offset with a fixed size. I don't think it should matter whether the struct is at the beginning of the class or the end.

the size of a struct within a containing type probably needs to always be the same as its size on the heap.

I agree. In fact, the bug described here seems to violate that.

kg commented 2 weeks ago

I'm not sure you could actually generalize a rule like that without having nasty consequences in cases where you're passing refs or pointers around

I don't understand. How would passing refs affect it? The struct itself is always at a fixed offset with a fixed size. I don't think it should matter whether the struct is at the beginning of the class or the end.

The ideal behavior (without context) would be to have no padding anywhere, but in some cases you would need padding. I.e. you have a struct with an ideal size of 3 bytes; the heap allocation if you box it can't be 3 bytes, it will get rounded up to some larger allocation size most likely. So at that point the size of it on the heap and the size of it within an enclosing type have become different and you have to be careful not to accidentally over-read or under-write when doing initialization. You'd also potentially run into problems then with people mixing up the different types of sizeof's, am i using the boxed size when i meant the in-place size? did i mean the marshal size, which might be different from both because it contains a bool? Am I thinking in terms of array layout instead of field layout? etc.

I ran into this previously on netfx writing an InlineArray equivalent specialized for 4 elements. The spacing between fields was not what I expected in some cases depending on the element type, and as a result I observed memory corruption when using Unsafe.Add. Mostly a lesson learned but a tricky one since you might naively assume that arrays and sequential fields have the same layout, but they don't necessarily.

timcassell commented 2 weeks ago

Considering this is for auto layout, I wouldn't even consider the implications of unsafe APIs.

tannergooding commented 2 weeks ago

struct wrappers and primitive fields are not always identical, even for Sequential, interop, and other scenarios. In this particular case they should be, and this is likely similar to https://github.com/dotnet/runtime/issues/109585; but it is an important distinction to call out.