rust-lang / rust-bindgen

Automatically generates Rust FFI bindings to C (and some C++) libraries.
https://rust-lang.github.io/rust-bindgen/
BSD 3-Clause "New" or "Revised" License
4.47k stars 697 forks source link

Bindgen generates incorrect return types for C++ class constructors in wasm32-unknown-emscripten #1822

Closed tuxmark5 closed 4 years ago

tuxmark5 commented 4 years ago

Bindgen generates incorrect return types for C++ class constructors in wasm32-unknown-emscripten, which causes linker warnings or even errors (when linking with -Wl,--fatal-warnings)

Input C/C++ Header

class Foo {
public:
  Foo(int x);
};

Foo::Foo(int x) { }

custom.rs

mod bindings;

pub extern "C" fn create_foo() {
    let _ = unsafe { bindings::Foo::new(0) }; 
}

bindings.rs

/* automatically generated by rust-bindgen */

#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Foo {
    pub _address: u8,
}
#[test]
fn bindgen_test_layout_Foo() {
    assert_eq!(
        ::std::mem::size_of::<Foo>(),
        1usize,
        concat!("Size of: ", stringify!(Foo))
    );
    assert_eq!(
        ::std::mem::align_of::<Foo>(),
        1usize,
        concat!("Alignment of ", stringify!(Foo))
    );
}
extern "C" {
    #[link_name = "\u{1}_ZN3FooC1Ei"]
    pub fn Foo_Foo(this: *mut Foo, x: ::std::os::raw::c_int);
}
impl Foo {
    #[inline(never)] // I added inline(never) to make this reproducable
    pub unsafe fn new(x: ::std::os::raw::c_int) -> Self {
        let mut __bindgen_tmp = ::std::mem::MaybeUninit::uninit();
        Foo_Foo(__bindgen_tmp.as_mut_ptr(), x);
        __bindgen_tmp.assume_init()
    }
}

Bindgen Invocation

$ em++ -c class.cpp -o class.o
$ bindgen class.cpp --generate constructors,types --output bindings.rs
$ rustc --emit=obj --crate-type=lib --target wasm32-unknown-emscripten custom.rs -o bindings.o
$ emcc bindings.o class.o

Actual Results

wasm-ld: warning: function signature mismatch: _ZN3FooC1Ei
>>> defined as (i32, i32) -> void in bindings.o
>>> defined as (i32, i32) -> i32 in class.o

Expected Results

No warnings/errors.

It appears that C++ constructors have an internal return value (which seemingly equals to this). However bindgen assumes that C++ constructors always return void, which causes function signature mismatches when linking code generated with bindgen with actual C++ object files.

I encountered this issue when trying to port skia-safe (which uses bindgen) to wasm32-unknown-emscripten.

emilio commented 4 years ago

Hmm... Is this a wasm-only thing? Or do actual C++ constructors end up with a this return type in clang too?

tuxmark5 commented 4 years ago

I've seen similar behavior on ARM a while ago, when trying to disassemble a certain binary.

However after digging for a bit in compiler explorer, it seems that this is wasm (or even wasm + clang) specific behavior.

tuxmark5 commented 4 years ago

Would it be possible to fix this issue? As I have to maintain a local version of bindgen to get skia-safe to compile for emscripten.

I'm using such patch to src/ir/function.rs:

-        let ret = Item::from_ty_or_ref(ty_ret_type, cursor, None, ctx);
+        let ret = if is_constructor { 
+            let void = Item::builtin_type(TypeKind::Void, false, ctx);
+            Item::builtin_type(TypeKind::Pointer(void), false, ctx)
+        } else {
+            Item::from_ty_or_ref(ty_ret_type, cursor, None, ctx)
+        };

For actual implementation it would need to be guarded to only activate on wasm32 builds

emilio commented 4 years ago

We have access to the target triple when we construct the BindgenContext, so it should be relatively easy to keep track of it and check it there. If you send a PR I'd happily merge it :)