microsoft / CsWin32

A source generator to add a user-defined set of Win32 P/Invoke methods and supporting types to a C# project.
MIT License
1.99k stars 84 forks source link

structs with VariableLengthInlineArray and SizeOf(0) #1179

Open dorssel opened 1 month ago

dorssel commented 1 month ago

C structures with a flexible array member at the end (array of [0] elements) get translated to a VariableLengthInLineArray. As an example, I take USB_DESCRIPTOR_REQUEST, which in C is:

typedef struct _USB_DESCRIPTOR_REQUEST {
    ULONG ConnectionIndex;
    struct {
        UCHAR bmRequest;
        UCHAR bRequest;
        USHORT wValue;
        USHORT wIndex;
        USHORT wLength;
    } SetupPacket;
    UCHAR Data[0];
} USB_DESCRIPTOR_REQUEST, *PUSB_DESCRIPTOR_REQUEST;

I know about the discussion that in C# (using CsWin32) the VariableLengthInLineArray always has 1 member (as if UCHAR Data[1] instead of UCHAR Data[0]), and therefore the sizeof(C# struct) does not match the sizeof(C struct). And that's OK, since we now have a convenience member function SizeOf for such structures that takes a count parameter for the real number of elements in the array.

However, the function is defined as:

/// <summary>Computes the amount of memory that must be allocated to store this struct, including the specified number of elements in the variable length inline array at the end.</summary>
internal static unsafe int SizeOf(int count)
{
    int v = sizeof(USB_DESCRIPTOR_REQUEST);
    if (count > 1)
        v +=checked((count - 1) * sizeof(byte));
        else
            if (count < 0)
                throw new ArgumentOutOfRangeException();
    return v;
}

This does not return the correct size for 0; SizeOf(0) is still off by 1 array element. In fact SizeOf(0) == SizeOf(1) in the current implementation. Note that for the example USB_DESCRIPTOR_REQUEST, 0 is actually a valid size for the array (as in: the data is optional).

I don't know if this is deliberate, or actually a bug.

Could SizeOf(0) be changed so it returns sizeof(C# struct) - sizeof(the 1 element in VariableLengthInLineArray) == sizeof(C struct)? Something like:

/// <summary>Computes the amount of memory that must be allocated to store this struct, including the specified number of elements in the variable length inline array at the end.</summary>
internal static unsafe int SizeOf(int count)
{
    int v = sizeof(USB_DESCRIPTOR_REQUEST);
    if (count >= 0)
        // NOTE: the -1 for count == 0 is intentional, to compensate for the omnipresent 1 element in VariableLengthInLineArray
        v +=checked((count - 1) * sizeof(byte));
    else
        throw new ArgumentOutOfRangeException();
    return v;
}
AArnott commented 1 month ago

That's an interesting problem. Am I correct in supposing it's important for you to get the size of the struct that reflects a 0 length trailing buffer specifically, such that passing a size that implies a length of 1 causes a malfunction? Or is this just to 'feel good' about the size matching what C would say?

It's potentially problematic to change the behavior to match what you say, because to C#, the VariableLengthInlineArray struct is one element in length. Suppose then that you use SizeOf(0) to determine the size of a buffer you needed to allocate, and then you copied a USB_DESCRIPTOR_REQUEST into it from C#. That would be a buffer overrun because the runtime would copy USB_DESCRIPTOR_REQUEST plus one element, since that's how the type is defined in C#. The SizeOf method was written to reflect the size of the memory that must be allocated from C#. It wasn't meant for initializing a field on the native struct that would represent the whole size of the struct plus buffer (given a 0 length buffer, anyway).

If this really is important, we may need to introduce another function alongside SizeOf with a very clear name and xml docs so folks know which one to call when.

dorssel commented 1 month ago

My use case could definitely be rewritten to use the current way CsWin32 generates the struct. But before CsWin32 supported the USB structures I already had my own definition as here:

https://github.com/dorssel/usbipd-win/blob/d064ea10aa3a17acfe83537e98b078214f05e75a/Usbipd/Interop/WinSDK.cs#L18-L33

I use it "C-style" to custom marshal the data that comes after it, which is not really byte data, but instead another structure. And with "C-style" I mean: use offset into the byte representation to get to the start of that following data struct. And that offset in C is just sizeof(USB_DESCRIPTOR_REQUEST). But not in C#, where there currently is no way to get the that original size (except subtracting the 1 byte manually). Here is how I use it (note: this code predates the CsWin32 support for these structures.

https://github.com/dorssel/usbipd-win/blob/d064ea10aa3a17acfe83537e98b078214f05e75a/Usbipd/ExportedDevice.cs#L79-L122

I am slowly trying to get rid of all my manual WinSDK "defines" and move to CsWin32 for everything. This thing just struck me. And like I said, I can easily rewrite it to work with CsWin32. I was just wondering why it is not possible to get the original C size for "sizeof(USB_DESCRIPTOR_REQUEST)".

On a side note: what if, instead of adding the empty struct with [] operator (which VariableLengthInlineArray really is) you would implement the 0-size array last element as member function. Then you could do something like:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal ref TData FlexibleArrayMember(int index)
{
    if (index < 0) {
        throw new ArgumentOutOfRangeException(nameof(index));
    }
    unsafe
    {
        fixed (STRUCT_TYPE* ptr = &this)
        {
            return ref *(((TData *)(ptr + 1)) + index);
        }
    }
}

Instead of addressing the flexible array like Data[index] it would be Data(index), but at no cost of the one extra byte for the empty struct. And such a member function should probably be marked as unsafe (even though its prototype itself does not require that).

And on that last note: why is the SizeOf() helper marked as unsafe. I think it shouldn't be...

AArnott commented 1 month ago

I think you have some good ideas here. I haven't the time this week probably, but I'll review more closely and see what we can adopt.

dorssel commented 1 month ago

Thanks. But that's what they realy are ... ideas ... not proposals. Like I said, I could use the generated types as they are by just rewriting a little bit of code. But if you are interested in the "flexible array without a 1-byte empty struct" idea. There are some corner cases to consider. Such as alignment, for example a C struct:

UCHAR a;
UCHAR b;
UCHAR c;
DWORD array[0];

In C this will have alignment 4, size 4, and a byte of padding between c and array. But if you simply make a C# struct with the first 3 members and a kludge method like I suggested, then of course you need something like pack(4). In that case a simple dummy padding member is not enough, as that would make the size 4, but still have alignment 1.