mystor / rust-cpp

Embed C++ directly inside your rust code!
Apache License 2.0
798 stars 44 forks source link

Wrong ABI used for small C++ classes. #18

Closed vadimcn closed 6 years ago

vadimcn commented 7 years ago

So if I generate C++ types using bindgen as described in https://github.com/rust-lang-nursery/rust-bindgen/issues/778, and then try to create and return such types via cpp!, I hit the exact same problem as described in that bug.

This code crashes with a segmentation fault, because C++ compiler wants to return Bar by-pointer, whereas Rust expects it in registers.

let bar = cpp!([] -> Bar as "Bar" { return MakeBar(); });

In retrospect, this kinda makes sense, because even extern "C" functions in .cpp files must obey C++ ABI.

Not sure if cpp crate should attempt to fix this, but something to keep in mind if you decide to implement "foreign types" from #16.

vadimcn commented 7 years ago

@mystor: So I've been thinking about this...

We could use a wrapper like this one to force the C++ compiler to return a value as "raw bytes":

    template<typename T>
    struct AsBytes {
        uint8_t bytes[sizeof(T)];

        AsBytes(const T& value) {
            memcpy(bytes, &value, sizeof(T));
        }
        AsBytes(T&& value) {
            memcpy(bytes, &value, sizeof(T));
        }
    };

In the simplest case, all closures would have to change their return types from T to AsBytes<T>. On the Rust side, we'd need to emit code to convert bytes back to properly aligned data.

Of course, this is inefficient and will hurt optimization on both sides. To deal with that, we could use type traits to detect potentially problematic types and wrap only those. The filter condition needs to be something like this:

    !std::is_trivially_copy_constructible<T>::value ||
    !std::is_trivially_destructible<T>::value ||
    std::is_polymorphic<T>::value

Thoughts?

DemiMarie commented 7 years ago

Could we use libclang to get the information we need?

mystor commented 7 years ago

@vadimcn I think the easier solution would be to simply design the call surface such that we don't need to worry about these ABI concerns, by making sure to tightly control the types entering and leaving the function.

For example, we could instead generate a function sorta like this (untested :-S):

cpp!([arg1 as "A", arg2 as "B"] -> T as  "T" {
    return frob(a, b);
});
void closure(A& arg1, B& arg2, T* ret) {
    new(ret) T([&] -> T {
        return frob(a, b);
    })());
}

Which I think should move construct the return value into the outparam, which would allow us to avoid ABI concerns. It might have a perf impact for simple types though - I would want to look into it.

@DemiMarie

I would prefer to avoid rust-cpp having a dependency on clang or libclang for building, as it's nice that it doesn't need a specific C++ compiler at the moment. If I do add some libclang components, it'll be optional and only used for cross-platform linking I would think.

I think it also should be possible for getting some of this information without needing libclang right now.

vadimcn commented 7 years ago

@mystor: yeah, I was thinking of perf impact. Would be nice to continue returning ints and the like in registers.

@DemiMarie: libclang doesn't expose this information. Even using clang's C++ API, doing this might be challenging, because it looks like ABI info does not get computed until codegen stage.

DemiMarie commented 7 years ago

Could we handle this by trying to compile a program, such that the program compiles if and only if it can be returned in a register?

On Sep 11, 2017 12:38 PM, "Michael Layzell" notifications@github.com wrote:

@vadimcn https://github.com/vadimcn I think the easier solution would be to simply design the call surface such that we don't need to worry about these ABI concerns, by making sure to tightly control the types entering and leaving the function.

For example, we could instead generate a function sorta like this (untested :-S):

cpp!([arg1 as "A", arg2 as "B"] -> T as "T" { return frob(a, b); });

void closure(A& arg1, B& arg2, T* ret) { new(ret) T([&] -> T { return frob(a, b); })()); }

Which I think should move construct the return value into the outparam, which would allow us to avoid ABI concerns. It might have a perf impact for simple types though - I would want to look into it.

@DemiMarie https://github.com/demimarie

I would prefer to avoid rust-cpp having a dependency on clang or libclang for building, as it's nice that it doesn't need a specific C++ compiler at the moment. If I do add some libclang components, it'll be optional and only used for cross-platform linking I would think.

I think it also should be possible for getting some of this information without needing libclang right now.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mystor/rust-cpp/issues/18#issuecomment-328586342, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWB1mSarPoXbzRLJ_EA-PTKQRotZgxks5shWH2gaJpZM4O1Qwi .

mystor commented 6 years ago

This should be fixed by #22