rust-lang / rust-bindgen

Automatically generates Rust FFI bindings to C (and some C++) libraries.
https://rust-lang.github.io/rust-bindgen/
BSD 3-Clause "New" or "Revised" License
4.39k stars 691 forks source link

Feature request: use DST to represent structures with flexible array members #2771

Open jsgf opened 7 months ago

jsgf commented 7 months ago

Input C/C++ Header

struct msg {
    unsigned int tag;
    unsigned int len;
    char payload[];
};

Bindgen Invocation

$ bindgen input.h

Actual Results

This produces an output using __IncompleteArrayField as expected:

#[repr(C)]
#[derive(Debug)]
pub struct msg {
    pub tag: ::std::os::raw::c_uint,
    pub len: ::std::os::raw::c_uint,
    pub payload: __IncompleteArrayField<::std::os::raw::c_char>,
}

Expected Results

It would be nice to have an option to model this by using a structure with a dynamic tail:

#[repr(C)]
#[derive(Debug)]
pub struct msg {
    pub tag: ::std::os::raw::c_uint,
    pub len: ::std::os::raw::c_uint,
    pub payload: [:std::os::raw::c_char],
}

These types of structures are awkward to use on their own, so it would need some helper methods:

impl msg {
    // Size and alignment so memory can be allocated
    fn layout(len: usize) -> ::std::alloc::Layout {
        unsafe {
            let p: *const Self = ::std::ptr::from_raw_parts(ptr::null(), len);
            ::std::alloc::Layout::for_value_raw(p)
        }
    }
    // Convert a raw pointer into an immutable view for a given number of payload elements.
    // Danger: Return lifetime unconstrained
    unsafe fn from_ptr<'a>(ptr: *const (), len: usize) -> &'a Self {
        let ptr: *const Self = ::std::ptr::from_raw_parts(ptr, len);
        &*ptr
    }
    // Same but mutable, and maybe uninitialized, to to allow Rust to initialize/mutate it.
    // Danger: Return lifetime unconstrained
    unsafe fn from_ptr_mut<'a>(
        ptr: *mut (),
        len: usize,
    ) -> ::std::mem::MaybeUninit<&'a mut Self> {
        let ptr: *mut Self = ::std::ptr::from_raw_parts_mut(ptr, len);
        ::std::mem::MaybeUninit::new(&mut *ptr)
    }
}

These methods would currently depend on the unstable ptr_metadata and layout_for_ptr features.

jsgf commented 7 months ago

Prototype implementation in #2772

emilio commented 7 months ago

Hmm, this seems fine, but maybe we should just make the implementation of __IncompleteArrayField use them or so? Would that not be feasible?

jsgf commented 7 months ago

That's a possibility but there's a few downsides:

Specifically for the second, since these become DST types, it means it's no longer possible to directly reference them by value, so you wouldn't be able to do something like:

let hdr = msg { ... set up fields ...};
// (something to set up payload)
*ptr = hdr;

That's not the end of the world of course - that's why from_ptr_mut exists.

The other problem is that all pointers and references to these objects will be fat, and so not C-ABI compatible. So something like:

struct controlblock {
    struct msg *messagequeue;
};

could not be directly converted to:

#[repr(C)]
struct controlblock {
    messagequeue: *mut msg,
}

Currently this proposed API is using *mut/const () as the thin placeholder, but we could define associated types for typed thin variants, eg:

#[repr(C)]
struct controlblock {
    messagequeue: *mut msg::Thin,
}

Likewise you wouldn't be able to directly pass or return the fat pointers to/from C-ABI functions.

I don't think any of these are show-stoppers; structures with flex array have always been a bit awkward to deal with, and I think this API is an accurate reflection of that (ie, it would make it harder to get things wrong). If we could wave a magic wand to stabilize the features and update all existing code to the new API, then I think we could replace __IncompleteArrayField entirely.

emilio commented 7 months ago

Yeah to be clear I meant to make __IncompleteArrayField optionally a DST, with the same-ish API to consumers, and we'd keep the current version as default at least for a bit. But yeah I guess you'd need from_ptr and such for the new version anyways so maybe it's not such a great idea.

jsgf commented 7 months ago

BTW I also experimented with this idea using a trait (https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=114e0ca2cbc10523037946a7d9c61a06) , since it's a fairly self-contained abstraction that you might want to generalize over.

But as far as I know bindgen doesn't have a support library crate where this could live, and it doesn't seem like it makes sense to emit a separate variant of this trait for each bindgenned module - that would be no better than the current proposal just using intrinsic methods.

jsgf commented 7 months ago

I've published PR #2772 with a fairly complete implementation of this idea.

jsgf commented 6 months ago

I think #2772 is ready for review.