spacejam / rio

pure rust io_uring library, built on libc, thread & async friendly, misuse resistant

rio

bindings for io_uring, the hottest thing to happen to linux IO in a long time.

Soundness status

rio aims to leverage Rust's compile-time checks to be misuse-resistant compared to io_uring interfaces in other languages, but users should beware that use-after-free bugs are still possible without unsafe when using rio. Completion borrows the buffers involved in a request, and its destructor blocks in order to delay the freeing of those buffers until the corresponding request has completed; but it is considered safe in Rust for an object's lifetime and borrows to end without its destructor running, and this can happen in various ways, including through std::mem::forget. Be careful not to let completions leak in this way, and if Rust's soundness guarantees are important to you, you may want to avoid this crate.
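
To make the leak hazard concrete, here is a minimal std-only sketch that does not use rio at all. `Guard` is a hypothetical stand-in for a completion: it borrows a flag and sets it in its destructor, roughly where rio's `Completion` would block until the kernel is done with the buffer. The names `Guard`, `dropped_normally`, and `leaked_with_forget` are made up for illustration.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical stand-in for a completion (not rio's type): it borrows a
/// flag and records in its destructor that "the request has finished".
struct Guard<'a> {
    finished: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // In a real completion this is where the destructor would block
        // until the kernel no longer touches the borrowed buffer.
        self.finished.set(true);
    }
}

/// Lets the guard go out of scope normally and reports whether the
/// destructor ran.
fn dropped_normally() -> bool {
    let finished = Cell::new(false);
    {
        let _guard = Guard { finished: &finished };
    }
    finished.get()
}

/// Leaks the guard with `mem::forget` (safe code, no `unsafe`) and reports
/// whether the destructor ran.
fn leaked_with_forget() -> bool {
    let finished = Cell::new(false);
    let guard = Guard { finished: &finished };
    mem::forget(guard); // the borrow ends here, but `drop` is never called
    finished.get()
}

fn main() {
    println!("normal drop ran destructor: {}", dropped_normally());
    println!("forget ran destructor: {}", leaked_with_forget());
}
```

In the second function the borrow checker is satisfied even though the destructor never ran. With a real in-flight request, that is the point at which the buffer could be freed while the kernel is still using it.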

Innovations

This is intended to be the core of sled's writepath. It is built with a specific high-level application in mind: a high performance storage engine and replication system.

What's io_uring?

io_uring is the biggest thing to happen to the linux kernel in a very long time. It will change everything. Anything that uses epoll right now will be rewritten to use io_uring if it wants to stay relevant. It started as a way to do real async disk IO without needing to use O_DIRECT, but its scope has expanded and it will continue to support more and more kernel functionality over time due to its ability to batch large numbers of different syscalls. In kernel 5.5 support is added for more networking operations like accept(2), sendmsg(2), and recvmsg(2). In 5.6 support is being added for recv(2) and send(2). io_uring has been measured to dramatically outperform epoll-based networking, with io_uring outperforming epoll-based setups more and more under heavier load. I started rio to gain an early deep understanding of this amazing new interface, so that I could use it ASAP and responsibly with sled.

io_uring unlocks the following kernel features:

To read more about io_uring, check out:

For some slides with interesting io_uring performance results, check out slides 43-53 of this presentation deck by Jens.

why not use those other Rust io_uring libraries?

examples that will be broken in the next day or two

async tcp echo server:

use std::{
    io,
    net::{TcpListener, TcpStream},
};

async fn proxy(ring: &rio::Rio, a: &TcpStream, b: &TcpStream) -> io::Result<()> {
    let buf = vec![0_u8; 512];
    loop {
        let read_bytes = ring.read_at(a, &buf, 0).await?;
        let buf = &buf[..read_bytes];
        ring.write_at(b, &buf, 0).await?;
    }
}

fn main() -> io::Result<()> {
    let ring = rio::new()?;
    let acceptor = TcpListener::bind("127.0.0.1:6666")?;

    extreme::run(async {
        // kernel 5.5 and later support TCP accept
        loop {
            let stream = ring.accept(&acceptor).await?;
            dbg!(proxy(&ring, &stream, &stream).await);
        }
    })
}
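
For comparison, the forwarding loop in `proxy` can be sketched with blocking std IO only, without rio or io_uring. The point is the `&buf[..read_bytes]` slicing: only the bytes actually read get written out. The function name `forward` is hypothetical, and the end-of-stream check (a read of zero bytes) is an addition in this sketch, not something the rio example above does.

```rust
use std::io::{self, Cursor, Read, Write};

/// Blocking std-only sketch (not rio): copies everything from `src` to
/// `dst` through a 512-byte buffer and returns the number of bytes moved.
fn forward<R: Read, W: Write>(src: &mut R, dst: &mut W) -> io::Result<usize> {
    let mut buf = vec![0_u8; 512];
    let mut total = 0;
    loop {
        let read_bytes = src.read(&mut buf)?;
        if read_bytes == 0 {
            // A zero-length read means end of stream for std readers.
            return Ok(total);
        }
        // Write only the prefix that was filled by this read.
        dst.write_all(&buf[..read_bytes])?;
        total += read_bytes;
    }
}

fn main() -> io::Result<()> {
    // In-memory stand-ins for the two sockets.
    let mut src = Cursor::new(b"hello io_uring".to_vec());
    let mut dst = Vec::new();
    let moved = forward(&mut src, &mut dst)?;
    println!("forwarded {} bytes", moved);
    Ok(())
}
```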

file reading:

let ring = rio::new().expect("create uring");
let file = std::fs::File::open("file").expect("openat");
let data: &mut [u8] = &mut [0; 66];
let completion = ring.read_at(&file, &data, at);

// if using threads
completion.wait()?;

// if using async
completion.await?

file writing:

let ring = rio::new().expect("create uring");
let file = std::fs::File::create("file").expect("openat");
let to_write: &[u8] = &[6; 66];
let completion = ring.write_at(&file, &to_write, at);

// if using threads
completion.wait()?;

// if using async
completion.await?

speedy O_DIRECT shi0t (try this at home / run the o_direct example)

use std::{
    fs::OpenOptions, io::Result,
    os::unix::fs::OpenOptionsExt,
};

const CHUNK_SIZE: u64 = 4096 * 256;

// `O_DIRECT` requires all reads and writes
// to be aligned to the block device's block
// size. 4096 might not be the best, or even
// a valid one, for yours!
#[repr(align(4096))]
struct Aligned([u8; CHUNK_SIZE as usize]);

fn main() -> Result<()> {
    // start the ring
    let ring = rio::new()?;

    // open output file, with `O_DIRECT` set
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .custom_flags(libc::O_DIRECT)
        .open("file")?;

    let out_buf = Aligned([42; CHUNK_SIZE as usize]);
    let out_slice: &[u8] = &out_buf.0;

    // reads fill this buffer, so it has to be mutable
    let mut in_buf = Aligned([42; CHUNK_SIZE as usize]);
    let in_slice: &mut [u8] = &mut in_buf.0;

    let mut completions = vec![];

    for i in 0..(10 * 1024) {
        let at = i * CHUNK_SIZE;

        // By setting the `Link` order,
        // we specify that the following
        // read should happen after this
        // write.
        let write = ring.write_at_ordered(
            &file,
            &out_slice,
            at,
            rio::Ordering::Link,
        );
        completions.push(write);

        let read = ring.read_at(&file, &in_slice, at);
        completions.push(read);
    }

    for completion in completions.into_iter() {
        completion.wait()?;
    }

    Ok(())
}
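
The alignment requirement mentioned in the comments above can be checked without touching a file or the ring. The following std-only sketch assumes a 4096-byte block size (which, as noted, may not be right for your device) and uses a hypothetical helper, `is_block_aligned`, to test a buffer's start address and length. The smaller `CHUNK` constant is also an assumption, chosen to keep the example light.

```rust
use std::mem;

// Assumed block size; check your own block device before relying on it.
const BLOCK: usize = 4096;
// Smaller than the example above so the sketch stays cheap to run.
const CHUNK: usize = BLOCK * 4;

#[repr(align(4096))]
struct Aligned([u8; CHUNK]);

/// Hypothetical helper: true if `buf` starts on a `block`-byte boundary
/// and its length is a whole number of blocks.
fn is_block_aligned(buf: &[u8], block: usize) -> bool {
    (buf.as_ptr() as usize) % block == 0 && buf.len() % block == 0
}

fn main() {
    let buf = Box::new(Aligned([0; CHUNK]));

    // `repr(align(4096))` raises the type's alignment to 4096 bytes.
    println!("align_of::<Aligned>() = {}", mem::align_of::<Aligned>());

    // The whole buffer satisfies both conditions.
    println!("whole buffer ok: {}", is_block_aligned(&buf.0, BLOCK));

    // Slicing off the first byte breaks both the address and the length.
    println!("shifted slice ok: {}", is_block_aligned(&buf.0[1..], BLOCK));
}
```

The file offset needs the same treatment. In the example above, `at = i * CHUNK_SIZE` is always a multiple of 4096, so offsets stay aligned as long as the chunk size is a multiple of the block size.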