randomPoison / cs-bindgen

Experiment in using Rust to build a library that can be loaded by Unity across multiple platforms

Remove the need for list conversion functions #66

Open randomPoison opened 4 years ago

randomPoison commented 4 years ago

While thinking about #65, I realized that we probably need a different approach to marshaling lists. Repeating what I said in a comment there:

Right now the semantics of the marshaling logic require that ownership be transferred across the FFI boundary. This came up most notably with Vec, where the from_abi impl for Vec assumes that the RawVec it receives was allocated on the Rust side from a valid Vec. This means that when we're passing a C# List<T> to a Rust function expecting a Vec<T>, we must first convert the List<T> to a RawSlice<T::Abi> and then pass that to an exported Rust function that can convert the intermediate representation to the Rust representation, allocate a Vec<T>, and return it to the C# side as a RawVec. This approach requires a lot of eager monomorphization to create a Vec conversion function for each exported type, and doesn't cover all cases.
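To make the round trip described above concrete, here is a rough sketch of what one eagerly monomorphized conversion function might look like for i32. The struct layouts and the name convert_vec_i32 are assumptions for illustration, not the actual cs-bindgen definitions, and a real export would also need an unmangled symbol name.

```rust
use std::mem::ManuallyDrop;

/// Borrowed view of memory owned by the caller (e.g. a pinned C# array).
/// Hypothetical layout; the real RawSlice may differ.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawSlice<T> {
    pub ptr: *const T,
    pub len: usize,
}

/// Owned, Rust-allocated buffer handed across the FFI boundary.
/// Hypothetical layout; the real RawVec may differ.
#[repr(C)]
pub struct RawVec<T> {
    pub ptr: *mut T,
    pub len: usize,
    pub capacity: usize,
}

impl<T> RawVec<T> {
    /// Leak the Vec's buffer so the other side can hold on to it.
    pub fn from_vec(vec: Vec<T>) -> Self {
        let mut vec = ManuallyDrop::new(vec);
        RawVec {
            ptr: vec.as_mut_ptr(),
            len: vec.len(),
            capacity: vec.capacity(),
        }
    }

    /// Safety: `self` must come from `from_vec` and must not be reclaimed twice.
    pub unsafe fn into_vec(self) -> Vec<T> {
        unsafe { Vec::from_raw_parts(self.ptr, self.len, self.capacity) }
    }
}

/// One function like this has to exist per element type. For i32 the ABI
/// representation is the value itself, so the "conversion" is just a copy.
///
/// Safety: `raw` must describe `raw.len` readable, initialized i32 values.
pub unsafe extern "C" fn convert_vec_i32(raw: RawSlice<i32>) -> RawVec<i32> {
    let items: &[i32] = if raw.len == 0 {
        &[] // avoid building a slice from a possibly null pointer
    } else {
        unsafe { std::slice::from_raw_parts(raw.ptr, raw.len) }
    };
    RawVec::from_vec(items.to_vec())
}
```

Since nothing generic can be exported, a copy of this function is needed for every element type that might appear in a list, which is where the bloat comes from.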

Most relevant here is that we can't generate conversion functions for lists of tuples, since we can't monomorphize composite types ahead of time. That means that even if we convert a Dictionary<K, V> to a RawSlice<(K::Abi, V::Abi)> on the C# side, we can't generate a monomorphized list conversion function for (K::Abi, V::Abi) ahead of time, so we can't convert the intermediate list into the appropriate Rust Vec.

I think this speaks to a failure in the list conversion approach more generally. Really, we don't want to have to do so much eager monomorphization since it clearly can't cover all cases and will result in a lot of bloat in the generated code that can't be optimized out since it's exported from the generated dylib. The core design of the ABI conversion logic is built around the idea that we can still support generic types as long as they are fully monomorphized in the exported interface; an exported function can't itself be generic, but it can contain generic types. Requiring eager monomorphization for collection types makes this difficult to support because it means things break down around composite types like tuples.

Fleshing out that idea some more, the main issue I ran into that led to the current list conversion approach was that the current marshaling logic assumes that the input to from_raw will be exactly what was produced by into_raw. This is a valid assumption for most user-defined types, but collection types and other types that allocate are asymmetric in the way that they're marshaled: When doing into_raw, we're actually passing a pointer to Rust-allocated memory over to the C# side and then the C# code will have to pass it back to Rust after the conversion so that the allocation can be freed. When doing from_raw, the list of values was allocated on the C# side and so cannot be directly converted into a collection on the Rust side.

The original design for ABI actually supported this better, too! We previously defined a different intermediate representation for a type for from_raw vs to_raw (they were even different traits, FromAbi and IntoAbi). This would have allowed Vec<T> to be passed to C# as a RawVec, while allowing C# to pass it back as a RawSlice. I removed this approach in #37 (motivation described in #35), unifying both into the single Abi trait we have today. At this point it seems like we'll probably have to undo this change (we can probably keep the single Abi trait, but we'll need separate associated types for from-raw and to-raw representations).
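A minimal sketch of what a single Abi trait with separate associated types might look like. The names IntoRaw, FromRaw, and RawPair, and the struct layouts, are hypothetical and not taken from the crate. The point is that Vec<T> can go out as an owned RawVec and come back as a borrowed RawSlice, and that tuples compose without any pre-generated conversion function.

```rust
use std::mem::ManuallyDrop;

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawSlice<T> { pub ptr: *const T, pub len: usize }

#[repr(C)]
pub struct RawVec<T> { pub ptr: *mut T, pub len: usize, pub capacity: usize }

#[repr(C)]
#[derive(Clone, Copy)]
pub struct RawPair<A, B> { pub first: A, pub second: B }

/// Hypothetical shape: one trait, two raw representations.
pub trait Abi: Sized {
    /// What Rust hands to C#.
    type IntoRaw;
    /// What C# hands to Rust.
    type FromRaw;

    fn into_raw(self) -> Self::IntoRaw;
    /// Safety: `raw` must be a valid value produced by the C# side.
    unsafe fn from_raw(raw: Self::FromRaw) -> Self;
}

impl Abi for i32 {
    type IntoRaw = i32;
    type FromRaw = i32;
    fn into_raw(self) -> i32 { self }
    unsafe fn from_raw(raw: i32) -> Self { raw }
}

// Tuples compose from their parts, so no per-tuple export is required.
impl<A: Abi, B: Abi> Abi for (A, B) {
    type IntoRaw = RawPair<A::IntoRaw, B::IntoRaw>;
    type FromRaw = RawPair<A::FromRaw, B::FromRaw>;

    fn into_raw(self) -> Self::IntoRaw {
        RawPair { first: self.0.into_raw(), second: self.1.into_raw() }
    }
    unsafe fn from_raw(raw: Self::FromRaw) -> Self {
        unsafe { (A::from_raw(raw.first), B::from_raw(raw.second)) }
    }
}

// The asymmetric case: owned buffer going out, borrowed slice coming in.
impl<T: Abi> Abi for Vec<T>
where
    T::FromRaw: Copy,
{
    type IntoRaw = RawVec<T::IntoRaw>;
    type FromRaw = RawSlice<T::FromRaw>;

    fn into_raw(self) -> Self::IntoRaw {
        let raw: Vec<T::IntoRaw> = self.into_iter().map(|item| item.into_raw()).collect();
        let mut raw = ManuallyDrop::new(raw); // C# must hand this back to be freed
        RawVec { ptr: raw.as_mut_ptr(), len: raw.len(), capacity: raw.capacity() }
    }

    unsafe fn from_raw(raw: Self::FromRaw) -> Self {
        if raw.len == 0 {
            return Vec::new();
        }
        // The memory belongs to C#; copy the elements out instead of adopting it.
        let items = unsafe { std::slice::from_raw_parts(raw.ptr, raw.len) };
        items.iter().map(|&item| unsafe { T::from_raw(item) }).collect()
    }
}
```

With this shape, an exported function taking Vec<(i32, i32)> would only instantiate the impls above inside its own body, so the conversion code is generated on demand rather than exported ahead of time.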

The other question this leaves us with is how to free allocations on the Rust side when returning a collection type to C#. Currently the approach here is also to eagerly generate monomorphized drop functions for Vec<T> for primitives and exported types, but this breaks down for tuples the same way that list conversion does. My current guess is that we might be able to manually free the allocation for a Vec by using std::alloc::dealloc directly. If we embed the Layout of the allocation in the RawVec that we pass to C#, we can export a single, type-erased function for freeing a Rust-allocated Vec from C#. At least in theory this seems pretty doable, though I suspect we'd be running the risk of hitting undefined behavior if the internals of Vec ever change such that using dealloc directly isn't correct. A more stable (though more complicated?) approach would be to instead pass a type-erased callback in the RawVec, something equivalent to a &dyn Drop that the C# side can invoke.
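Both freeing strategies can be sketched roughly as follows. All names here (LayoutVec, CallbackVec, free_with_layout, free_with_callback) are made up for illustration. The layout variant assumes Vec allocates its buffer from the global allocator with exactly Layout::array::<T>(capacity), which is the assumption flagged as risky above. It also frees memory without running element destructors, so the sketch restricts it to Copy elements.

```rust
use std::alloc::{dealloc, Layout};
use std::mem::ManuallyDrop;

/// Option 1: carry the allocation's layout so one export can free any buffer.
#[repr(C)]
pub struct LayoutVec {
    pub ptr: *mut u8,
    pub len: usize,
    pub size: usize,  // bytes in the allocation
    pub align: usize, // alignment of the allocation
}

pub fn with_layout<T: Copy>(vec: Vec<T>) -> LayoutVec {
    let mut vec = ManuallyDrop::new(vec);
    // ASSUMPTION: this matches the layout Vec used for its buffer.
    let layout = Layout::array::<T>(vec.capacity()).expect("capacity overflow");
    LayoutVec {
        ptr: vec.as_mut_ptr().cast(),
        len: vec.len(),
        size: layout.size(),
        align: layout.align(),
    }
}

/// Single type-erased export. Safety: `raw` must come from `with_layout`
/// and must not be freed twice.
pub unsafe extern "C" fn free_with_layout(raw: LayoutVec) {
    if raw.size == 0 {
        return; // empty Vecs and zero-sized elements never allocated
    }
    if let Ok(layout) = Layout::from_size_align(raw.size, raw.align) {
        unsafe { dealloc(raw.ptr, layout) };
    }
}

/// Option 2: carry a callback that knows the concrete element type.
#[repr(C)]
pub struct CallbackVec {
    pub ptr: *mut u8,
    pub len: usize,
    pub capacity: usize,
    pub drop_fn: unsafe extern "C" fn(*mut u8, usize, usize),
}

/// Instantiated per element type on demand, but never exported by name.
unsafe extern "C" fn drop_vec<T>(ptr: *mut u8, len: usize, capacity: usize) {
    // Rebuild the Vec and let it drop, running element destructors too.
    drop(unsafe { Vec::from_raw_parts(ptr.cast::<T>(), len, capacity) });
}

pub fn with_callback<T>(vec: Vec<T>) -> CallbackVec {
    let mut vec = ManuallyDrop::new(vec);
    CallbackVec {
        ptr: vec.as_mut_ptr().cast(),
        len: vec.len(),
        capacity: vec.capacity(),
        drop_fn: drop_vec::<T>,
    }
}

/// Single type-erased export. Safety: `raw` must come from `with_callback`
/// and must not be freed twice.
pub unsafe extern "C" fn free_with_callback(raw: CallbackVec) {
    unsafe { (raw.drop_fn)(raw.ptr, raw.len, raw.capacity) };
}
```

The callback variant goes through Vec's own Drop, so it does not depend on how Vec allocates internally, at the cost of one extra function pointer per RawVec.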