Michael-F-Bryan / rust-ffi-guide

A guide for doing FFI using Rust
https://michael-f-bryan.github.io/rust-ffi-guide/
Creative Commons Zero v1.0 Universal

Guide Rewrite #64

Open Michael-F-Bryan opened 6 years ago

Michael-F-Bryan commented 6 years ago

It's been a while since I last updated the guide and I've learned a lot since then. I think it may be a good idea to start over and take a slightly different approach.

I think we may want to break things into two parts. In the first part we'll address common patterns and techniques used when writing FFI code, with the second part being a worked example (or two?) that lets us show off these techniques.

Progress so far (rendered)


Concepts and Patterns:

Ideas for worked examples:

Note: All PRs should be made against the reboot branch (#65) so we don't overwrite the existing guide until the new version is ready.

richard-uk1 commented 6 years ago

I think a good example would be using the system blas/lapack libs to do some number crunching. They take flat data structures, so allocation is a bit simpler. These libs can be highly optimized, so it's something you might want to do :).

Michael-F-Bryan commented 6 years ago

@derekdreery that's a really good idea! I wrote an example that wraps libmagic (the library behind the file command) to showcase linking to a C library, but perhaps I should rewrite it to use blas?

richard-uk1 commented 6 years ago

@Michael-F-Bryan I wrote a really long response that github/my browser conspired to delete :(.

I think your example is fine. I'd add that the call to magic_close in the failure path (when magic.cookie.is_null()) might try to free a null pointer. In this case I'm sure calling the close function is correct, but it might be worth mentioning that this is not automatically the case - you might need to check the C source.
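
As a rough sketch of the defensive pattern described here (this is not the guide's actual libmagic wrapper), the snippet below uses a hypothetical `Handle` type and a stand-in `fake_close` function in place of a real C cleanup routine. The idea is to refuse to build the wrapper from a null pointer, and to guard the cleanup call anyway in case the C function does not tolerate null.

```rust
use std::os::raw::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for a C cleanup function such as `magic_close()`. A real binding
// would be declared in an `extern "C"` block; this one only counts its calls.
static CLOSE_CALLS: AtomicUsize = AtomicUsize::new(0);

unsafe fn fake_close(_cookie: *mut c_void) {
    CLOSE_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// Hypothetical owning wrapper around a raw handle returned by C code.
pub struct Handle {
    cookie: *mut c_void,
}

impl Handle {
    /// Only produce a `Handle` when the pointer is non-null, so a
    /// half-initialised wrapper never exists.
    pub fn from_raw(cookie: *mut c_void) -> Option<Handle> {
        if cookie.is_null() {
            None
        } else {
            Some(Handle { cookie })
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Whether a C cleanup function accepts null has to be checked in its
        // documentation or source, so guard the call rather than assume.
        if !self.cookie.is_null() {
            unsafe { fake_close(self.cookie) };
        }
    }
}
```

With this shape the failure path never reaches the cleanup function at all, because no `Handle` is created for a null pointer.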

EDIT Oh, and also implementing Display and Error on error types is nice.
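
For illustration, implementing `Display` and `Error` might look roughly like the following. `MagicError`, its variants, and the `describe` helper are hypothetical names invented for this sketch, not taken from the guide's code.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type for a wrapper around a C library.
#[derive(Debug)]
pub enum MagicError {
    /// The library handed back a null handle.
    CreationFailed,
    /// The library reported an error message.
    Library(String),
}

impl fmt::Display for MagicError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MagicError::CreationFailed => write!(f, "unable to create the magic cookie"),
            MagicError::Library(msg) => write!(f, "libmagic error: {}", msg),
        }
    }
}

// `Error` has no required methods once `Debug` and `Display` are implemented.
impl Error for MagicError {}

/// Because `MagicError` implements `Error`, it converts into `Box<dyn Error>`
/// and composes with other error types via `?` or `.into()`.
pub fn describe(fail: bool) -> Result<&'static str, Box<dyn Error>> {
    if fail {
        return Err(MagicError::Library("could not load database".to_string()).into());
    }
    Ok("ok")
}
```

The main benefit is that callers can print the error for users and mix it with other error types without writing conversion code by hand.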

Michael-F-Bryan commented 6 years ago

Thanks for spotting the bug @derekdreery, I had another look at the code and fixed the issue. I believe it stems from the fact that I was creating a Magic object which wasn't 100% guaranteed to be in a valid state, all to save having to copy/paste the MAGIC_CREATED.store(true, Ordering::Relaxed) call.

anxiousmodernman commented 6 years ago

I remember coming across the reddit post, but I just recently needed this rewrite of your guide. The rewrite is a little less Qt focused.

I am trying to embed the Jim Tcl interpreter, so I am wrapping a long-lived C pointer that is managing its own world under the hood.

That's all to say, this work has been helpful for me!

Michael-F-Bryan commented 6 years ago

Thanks @anxiousmodernman!

I'm actually planning to add a worked example of embedding an interpreter in Rust, so your experience may be useful. Is there anything you think others would benefit from knowing when trying to embed one language within another?

Normally I'd reach for Lua here, but it feels a bit like cheating since I've already done lots with embedding Lua in Rust and because the language is actually designed for that purpose.

anxiousmodernman commented 6 years ago

My first attempt was straightforward: to write a .so library in Rust that will be auto-loaded by the scripting language. This lets us write language extensions in Rust that present themselves as "importable" libraries in Jim Tcl.

This works, but it assumes the user has the interpreter installed at the system level. My next attempt is in the design phase right now: how can we 1) link all the interpreter code into our Rust binary and 2) generically specify, in Rust, the structure of a language extension, so that folks can merely implement a trait and load those extensions/DSLs into the interpreter at runtime. Something like:

let interp = Interpreter::new();
let my_dsl = MyDSL::new();
interp.load_ext(my_dsl)?;
let val = interp.eval("some custom language")?;

Jim has a complicated ./configure script that needs to be run before compiling with make. Right now in build.rs I am literally shelling out to ./configure/make with two Command structs in Rust. I'd like to use the cc crate, but I'm not confident that I've reverse-engineered the autotools build setup :grimacing: .

Generically specifying the structure of a command will be an interesting challenge. I will post here if I figure it out in my next version.

ivnsch commented 4 years ago

Could you maybe also cover std::mem::forget? I've seen it used to pass control of the memory management to the caller (which would be C++ in your examples).

Michael-F-Bryan commented 4 years ago

@i-schuetz can you give me an example? Shouldn't you be using something like Box::into_raw() to pass ownership of an object to C++? (i.e. use some indirection and pass the pointer around)

You shouldn't be passing a Rust object with a destructor by value across the FFI boundary because it'll most likely be #[repr(Rust)] and the foreign code won't know how to clean it up properly, and if it doesn't have a destructor (i.e. it's Copy) then you don't need mem::forget().

Another more satisfying approach is to use std::mem::MaybeUninit or std::mem::ManuallyDrop.
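
A minimal sketch of the Box::into_raw() approach is shown below, assuming a hypothetical `Counter` type and made-up function names. A real export would also carry `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition), which is left out here so the snippet compiles unchanged on any edition.

```rust
/// Hypothetical Rust type that foreign code only ever sees as an opaque pointer.
pub struct Counter {
    value: u32,
}

/// Allocate a `Counter` on the heap and pass ownership to the caller.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { value: 0 }))
}

/// Borrow the object through the pointer. Returns 0 if the pointer is null.
pub unsafe extern "C" fn counter_increment(counter: *mut Counter) -> u32 {
    match unsafe { counter.as_mut() } {
        Some(c) => {
            c.value += 1;
            c.value
        }
        None => 0,
    }
}

/// Take ownership back from the caller and run the destructor.
pub unsafe extern "C" fn counter_free(counter: *mut Counter) {
    if !counter.is_null() {
        // Rebuilding the Box lets Rust free the allocation as usual.
        drop(unsafe { Box::from_raw(counter) });
    }
}
```

The foreign side never needs to know the layout of `Counter`. It only has to promise to call `counter_free()` exactly once for every pointer returned by `counter_new()`.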

ivnsch commented 4 years ago

@Michael-F-Bryan Thanks for the quick reply! Example:

use std::os::raw::{c_char};
use core_foundation::string::{CFString, CFStringRef};
use core_foundation::base::TCFType;
// ...

#[no_mangle]
pub unsafe extern "C" fn get_reports(interval_number: u32, interval_length: u32) -> CFStringRef {
    let result = COMP_ROOT.api.get_reports(interval_number as u64, interval_length as u64);

    let lib_result = match result {
      Ok(success) => LibResult { status: 200, data: Some(success), error_message: None },
      Err(e) => LibResult { status: e.http_status, data: None, error_message: Some(e.to_string()) }
    };

    let lib_result_string = serde_json::to_string(&lib_result).unwrap();

    let cf_string = CFString::new(&lib_result_string);
    let cf_string_ref = cf_string.as_concrete_TypeRef();

    ::std::mem::forget(cf_string);

    return cf_string_ref;
}

(I'm not 100% sure it works correctly, I adapted it from other code. If it's incorrect I'd appreciate clarification / what to use instead)

Note that this is to communicate with an iOS app, thus the core_foundation dependencies.

Michael-F-Bryan commented 4 years ago

That as_concrete_TypeRef() and std::mem::forget() dance feels a bit odd.

Ideally, CFString would have some sort of into_raw() method which consumes the CFString and returns a *const __CFString (the type behind the CFStringRef alias) to indicate ownership of the string is being passed from Rust to the caller. Sure, it would call std::mem::forget() under the hood so you don't run the destructor, but it shouldn't be something the end user needs to do.
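
To illustrate what such an into_raw() method could look like, here is a sketch on a hypothetical `OwnedString` wrapper. It is not the core_foundation API, and it uses std::mem::ManuallyDrop rather than std::mem::forget() to skip the destructor. A plain heap byte and a release counter stand in for the foreign string so the effect can be observed.

```rust
use std::mem::ManuallyDrop;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times the destructor released the underlying resource.
static RELEASES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical owning wrapper, standing in for something like `CFString`.
pub struct OwnedString {
    raw: *mut u8,
}

impl OwnedString {
    pub fn new(byte: u8) -> OwnedString {
        OwnedString { raw: Box::into_raw(Box::new(byte)) }
    }

    /// Consume the wrapper and hand the raw pointer to the caller.
    /// Wrapping `self` in `ManuallyDrop` means `Drop` never runs for it.
    pub fn into_raw(self) -> *mut u8 {
        let this = ManuallyDrop::new(self);
        this.raw
    }

    /// Reclaim ownership of a pointer previously returned by `into_raw()`.
    pub unsafe fn from_raw(raw: *mut u8) -> OwnedString {
        OwnedString { raw }
    }
}

impl Drop for OwnedString {
    fn drop(&mut self) {
        RELEASES.fetch_add(1, Ordering::SeqCst);
        // Free the allocation made in `new()`.
        unsafe { drop(Box::from_raw(self.raw)) };
    }
}
```

Taking `self` by value makes the ownership transfer visible in the signature, so callers cannot keep using the wrapper after giving the pointer away.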


Also... Try to avoid using unwrap() in an extern "C" function. It's UB for Rust to unwind across the FFI boundary, so if anything went wrong with serialising lib_result to JSON you'd be setting yourself up for a bad time.

If you're lucky the unwinding code will detect that it's leaving a Rust stack frame and abort (or not), otherwise there's a good chance you'll have a corrupted stack or other fun issues, although it tends to be less UB on Windows because panicking is kinda handled by the OS.
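
One common way to keep panics from crossing the boundary is std::panic::catch_unwind() combined with integer status codes. The sketch below assumes a hypothetical `risky` operation and a made-up error-code convention, and it relies on the default unwinding panic strategy (with panic=abort there is nothing to catch).

```rust
use std::panic;

/// Hypothetical fallible operation standing in for the serialisation step.
/// It returns `Err` for negative input and panics on overflow.
fn risky(input: i32) -> Result<i32, String> {
    if input < 0 {
        return Err("negative input".to_string());
    }
    Ok(input.checked_mul(2).expect("overflow while doubling"))
}

/// Returns 0 on success and writes the result through `out`.
/// Non-zero codes: 1 = null pointer, 2 = ordinary error, 3 = panic caught.
pub unsafe extern "C" fn double_it(input: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return 1;
    }

    // Any panic raised inside the closure is caught here instead of
    // unwinding into the foreign caller.
    match panic::catch_unwind(|| risky(input)) {
        Ok(Ok(value)) => {
            unsafe { *out = value };
            0
        }
        Ok(Err(_)) => 2,
        Err(_) => 3,
    }
}
```

Returning a `Result` from the inner function and mapping it to a code avoids the need for unwrap() in the exported function, while catch_unwind() acts as a backstop for panics that slip through anyway.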

ivnsch commented 4 years ago

Thanks @Michael-F-Bryan! Yeah, I don't know why CFString doesn't have (or at least doesn't seem to have) a method like that. But it's useful to know in general that std::mem::forget() can be used / is safe in this context (and IMO it would probably merit being mentioned somewhere in this guide. Maybe std::mem::ManuallyDrop too).

And yeah, I have to remove those unwrap()... It's "draft" code. Good to know that unwinding across FFI is such a big deal, will be extra careful.

Have you thought about adding a section on mobile apps (iOS / Android, so essentially Core Foundation and JNI) to this guide? Rust is a really good fit for cross-platform domain logic / services in mobile apps and a nice alternative to Kotlin Native (and partly to React Native, Flutter, etc.), in particular because of its low footprint, its performance, and the fact that it can be reused for the Web via WebAssembly or even for desktop apps. Having a dedicated guide would be very helpful and could popularize this approach.