ZettaScaleLabs / stabby

A Stable ABI for Rust with compact sum-types
Other
328 stars 13 forks source link

Crates.io (latest) docs.rs

[!WARNING]
Due to a breaking change in Rust 1.78, stabby's implementation of trait objects may raise performance issues:

  • Only non-nightly, >= 1.78 versions of Rust are affected
  • The v-tables backing trait objects are now inserted in a global lock-free set.
  • This set is leaked: valgrind will be angry at you.
  • This set grows with the number of distinct (type, trait-set) pairs. Its current implementation is a vector:
    • Lookup is done through linear search (O(n)), which stays the fastest for <100 number of elements.
    • Insertion is done by cloning the vector (O(n)) and replacing it atomically, repeating the operation in case of collision.
    • Efforts to replace this implementation with immutable b-tree maps are ongoing (they will be scrapped if found to be much slower than the current implementation).

This note will be updated as the situation evolves. In the meantime, if your project uses many stabby-defined trait objects, I suggest using either nightly or a < 1.78 version of the compiler.

A Stable ABI for Rust with compact sum-types

stabby is your one-stop-shop to create stable binary interfaces for your shared libraries easily, without having your sum-types (enums) explode in size.

Your main vector of interaction with stabby will be the #[stabby::stabby] proc-macro, with which you can annotate a lot of things.

Why would I want a stable ABI? And what even is an ABI?

ABI stands for Application Binary Interface, and is like API's more detail focused sibling. While an API defines what type of data a function expects, and what properties these types should have; ABI defines how this data should be laid out in memory, and how a function call even works.

How data is laid out in memory is often called "representation": field ordering, how variants of enums are distinguished, padding, size... In order to communicate using certain types, two software units must agree on what these types look like in memory.

Function calls are also highly complex under the hood (although it's rare for developers to need to think about them): is the callee or caller responsible for protecting the caller's register from callee's operations? In which registers / order on the stack should arguments be passed? What CPU instruction is used to actually trigger the call? A set of replies to these questions (and a few more) is called "calling convention".

In Rust, unless you explicitly select a known representation for your types through #[repr(_)], or an explicit calling convention for your functions with extern "_", the compiler is free to do whatever it pleases with these aspects of your software: the process by which it does that is explicitly unstable, and depends on your compiler version, the optimization level you selected, some llama's mood in a wool farm near Berkshire... who knows?

The problem with that comes when dynamic linkage is involved: since the ABI for most things in Rust is unstable, software units (such as a dynamic library and the executable that requires it) that have been built through different compiler calls may disagree on these decisions about ABI, even though there's no way for the linker to know that they did.

Concretely, this could mean that your executable thinks the leftmost 8 bytes of Vec<Potato> is the pointer to the heap allocation, while the library believes them to be its length. This could also mean the library thinks it's free to clobber registers when its functions are called, while the executable relied on it to save them and restore them before returning.

stabby seeks to help you solve these issues by helping you pin the ABI for a subset of your program, while helping you retain some of the layout optimizations rustc provides when using its unstable ABI. On top of this, stabby allows you to annotate function exports and imports in a way that also serves as a check of your dependency versioning for types that are stabby::abi::IStable.

Structures

When you annotate structs with #[stabby::stabby], two things happen:

Enums

When you annotate an enum with #[stabby::stabby], you may select an existing stable representation (like you must with abi_stable), but you may also select #[repr(stabby)] (the default representation) to let stabby turn your enum into a tagged-union with a twist: the tag may be a ZST that inspects the union to emulate Rust's niche optimizations.

Note that #[repr(stabby)] does lose you the ability to pattern-match.

Due to limitations of the trait solver, #[repr(stabby)] enums have a few paper-cuts:

#[repr(stabby)] enums are implemented as a balanced binary tree of stabby::result::Result<Ok, Err>, so discriminants are always computed between two types through the following process:

Unions

If you want to make your own internally tagged unions, you can tag them with #[stabby::stabby] to let stabby check that you only used stable variants, and let it know the size and alignment of your unions. Note that stabby will always consider that unions have no niches.

Traits

When you annotate a trait with #[stabby::stabby], an ABI-stable v-table is generated for it. You can then use any of the following type equivalence:

Note that vtable!(Traits) and dynptr!(..dyn Traits..) support any number of traits: vtable!(TraitA + TraitB<Output = u8>) or dynptr!(Box<dyn TraitA + TraitB<Output = u8>>) are perfectly valid, but ordering must remain consistent.

However, the v-tables generated by stabby will not take super-traits into account.

In order for stabby::dynptr!(Box<dyn Traits + 'a>) to have Trait's methods, you will need to use trait::{TraitDyn, TraitDynMut};, so make sure you don't accidentally seal these traits which are automatically declared with the same visibility as your Trait.

stabby::closure exports the CallN, CallMutN and CallOnceN traits, where N (in 0..=9) is the number of arguments, as ABI-stable equivalents of Fn, FnMut and FnOnce respectively.

Since version 1.0.1, the v-tables generated by #[stabby::stabby] always assume all of their method arguments to be ABI-stable, to prevent the risk of freezing rustc. Unless your trait has methods referencing its own v-table, it's advised to use #[stabby::stabby(checked)] instead to avoid the v-table being marked as stable despite some types in its interface not actually being stable.

Since stabby generates an additional struct when placed on a trait, you may want to add attributs to that struct's declaration. You can do so, you can use #[stabby::vt_attr(your_attribe = "goes here")]. Any other attribute will be placed on the original trait.

Functions

#[stabby::stabby]

Annotating a function with #[stabby::stabby] makes it extern "C" (but not #[no_mangle]) and checks its signature to ensure all exchanged types are marked with stabby::abi::IStable. You may also specify the calling convention of your choice.

#[stabby::export]

Works just like #[stabby::stabby], but will add #[no_mangle] to the annotated function, and produce two other no-mangle functions:

#[stabby::export(canaries)]

Works on any function, including ones that would be FFI-unsafe. On top of adding #[no_mangle] to the original function, it will add a small set of <fn_name>_<canary> symbols to the produced shared libraries. These canaries include rustc's version, the optimization level, and other properties that may cause the compiler to use a different ABI for <fn_name>.

The presence of these symbols can then be checked for by the linker when loading the shared library, preventing linkage when the loader requests canaries with incompatible versions.

#[stabby::import(...)]

Annotating an extern block with this is equivalent to #[link(...)], except the symbols will be lazy-initialized by using <fn_name>_stabbied, ensuring that the reports on the functions parameters match before letting you call it.

If you want to handle potential mismatch errors without panicking, you can call <fn_name>.as_ref(), which will let you inspect the reports for <fn_name> in case of failure.

#[stabby::import(canaries="rustc, opt_level", ...)]

Annotating an extern block with this is equivalent to #[link(...)], but the canaries corresponding to your spec will be required for linkage to be possible. This mirrors export(canaries), which always exports all available canaries, but you can choose which canaries you want to enable from the following set:

The stabby::libloading::StabbyLibrary trait

Additional methods for libloading::Library that expose symbol getters which will fail if the canaries are absent, or in case of a report mismatch.

These methods are still considered unsafe, but they will reduce the risks of accidentally loading ABI-incompatible code. Reports also act as a runtime type-check, reducing the risk of mistyping a symbol.

Async

Any implementation of core::future::Future on a stable type will work regardless of which side of the FFI-boundary that stable type was constructed. However, futures created by async blocks and async functions aren't ABI-stable, so they must be used through trait objects.

stabby supports futures through the stabby::future::Future trait. Async functions are turned by #[stabby::stabby] into functions that return a Dyn<Box<()>, vtable!(stabby::future::Future + Send + Sync)> (the Send and Sync bounds may be removed by using #[stabby::stabby(unsync, unsend)]), which itself implements core::future::Future.

stabby doesn't support async traits yet, but you can use the following pattern to implement them:

use stabby::{slice::SliceMut, future::DynFuture};
#[stabby::stabby]
pub trait AsyncRead {
    fn read<'a>(&'a mut self, buffer: SliceMut<'a, [u8]>) -> DynFuture<'a, usize>;
}
impl MyAsyncTrait for SocketReader {
    extern "C" fn read<'a>(&'a mut self, mut buffer: SliceMut<'a, [u8]>) -> DynFuture<'a, usize> {
        Box::new(
            async move {
                let slice = buffer.deref_mut();
                let read = SocketReader::read_async(&mut self.socket, slice).await;
                buffer = slice.into();
                read
            }
        ).into()
    }
}

Incremental stability

stabby also lets you tell it that something is ABI-stable even if you couldn't chain #[stabby::stabby] all along using stabby::abi::StableLike.

Combined with the ZSTs in stabby::compiler_version that implement stabby::IStable however you tell them to, but only when compiled with their respective versions of the compiler, this lets you state that some types are only stable if compiled with the appropriate compiler version. But the ZSTs will still exist even if not, so the types will still be usable anywhere that doesn't have a stabby::IStable bound.

The stabby "manifesto"

stabby was built in response to the lack of ABI-stability in the Rust ecosystem, which makes writing plugins and other dynamic linkage based programs painful. Currently, Rust's only stable ABI is the C ABI, which has no concept of sum-types, let alone niche exploitation.

However, our experience in software engineering has shown that type-size matters a lot to performance, and that sum-types should therefore be encoded in the least space-occupying manner.

My hope with stabby comes in two flavors:

stabby's SemVer policy

Stabby includes all of its stabby_abi::IStable implementation in its public API: any change to an IStable type's memory representation is a breaking change which will lead to a MAJOR version change.

From 6.1.1 onwards, Stabby follows SemVer Prime, using the api, abi as the key. Here's a few ways you can interpret that: