fjall-rs / fjall

🗻 LSM-based embeddable key-value storage engine written in safe Rust
https://fjall-rs.github.io/
Apache License 2.0
533 stars, 23 forks

[feature] io-uring support #92

Closed jonahlund closed 5 days ago

jonahlund commented 5 days ago

I'm wondering whether support for io-uring could be added without any major rewrites and, if so, whether it would benefit async programs enough to be worth it over simply using spawn_blocking.

I could take a look at implementing something like this if it's deemed relevant enough.

marvin-j97 commented 5 days ago

First and foremost, it would have to be implemented in lsm-tree (and value-log); fjall is just a wrapper around that. I'm not sure how feasible it is to rewrite everything into an async core, or whether it would really provide great performance improvements. And to benchmark it, you would have to actually migrate everything, which would be a lot of work. There are some benchmarks that don't look great: https://github.com/facebook/rocksdb/issues/11017

But, there are certain isolated operations that may be optimizable using io-uring without having to migrate the entire code base, specifically:

> instead of simply using spawn_blocking

There are situations where you don't need to wrap everything in spawn_blocking either. A rough guide is to not do work between two await points for more than 10-100µs.

But when you write with fsync (consumer SSD: ~1ms, HDD: ~15ms) or read from an HDD (~15ms), you definitely want spawn_blocking: the wait is long enough that the runtime could perform other meaningful work in the meantime.
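To make the rule of thumb concrete, here is a minimal std-only sketch that times a durable write (write plus fsync) and compares it against the upper end of the 10-100µs guide. The names `AWAIT_BUDGET`, `timed_durable_write`, and `should_offload` are hypothetical and not part of fjall, lsm-tree, or any async runtime; actual latencies depend on the device and filesystem.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;
use std::time::{Duration, Instant};

/// Assumption: use the upper end of the rough 10-100µs guide as the budget.
const AWAIT_BUDGET: Duration = Duration::from_micros(100);

/// Writes `data` to `path`, fsyncs it, and returns how long the call blocked.
fn timed_durable_write(path: &Path, data: &[u8]) -> io::Result<Duration> {
    let start = Instant::now();
    let mut file = File::create(path)?;
    file.write_all(data)?;
    // sync_all asks the OS to flush file data and metadata to the device.
    file.sync_all()?;
    Ok(start.elapsed())
}

/// Returns true if a call that blocked for `elapsed` is too slow to run
/// inline between two await points under the chosen budget.
fn should_offload(elapsed: Duration) -> bool {
    elapsed > AWAIT_BUDGET
}
```

With the figures quoted above, `should_offload(Duration::from_millis(1))` (SSD fsync) and `should_offload(Duration::from_millis(15))` (HDD) both return true, while a 50µs in-memory operation stays under the budget and can run inline.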

jonahlund commented 5 days ago

I see, thank you for the insightful response!

So I guess my initial thought was that if async io-uring were added to fjall (lsm-tree), it would probably be best kept as a separate optional feature rather than a core rework, kind of like how sled provides a complementary flush_async method for async io, providing better efficiency for async programs. But as you mentioned, the question is how beneficial it is to add true async io as opposed to using spawn_blocking.

marvin-j97 commented 5 days ago

> kind of like how sled provides a complementary flush_async method for async io

Note that that function pretty much just moves the flush into a thread pool and awaits that:

https://github.com/spacejam/sled/blob/005c023ca94d424d8e630125e4c21320ed160031/src/tree.rs#L945
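The general pattern of handing a blocking flush to another thread and waiting on the result can be sketched with the standard library alone. This is not sled's code: `flush_in_background` is a hypothetical helper, it spawns one thread per call rather than using a thread pool, and `JoinHandle::join` stands in for awaiting a future.

```rust
use std::fs::File;
use std::io;
use std::thread::{self, JoinHandle};

/// Moves the blocking fsync onto a separate thread and returns immediately.
/// The caller can do other work, then call `join()` to collect the result.
fn flush_in_background(file: File) -> JoinHandle<io::Result<()>> {
    // The closure takes ownership of `file`; sync_all blocks only the
    // spawned thread, not the caller.
    thread::spawn(move || file.sync_all())
}
```

The I/O underneath is still a synchronous fsync; only the waiting is moved off the calling thread, which is the same trade-off as spawn_blocking.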

> it would probably be best kept as a separate optional feature rather than a core rework

The problem is that you would basically have to duplicate the entire library to maintain separate sync and async implementations. As of now, I think async IO is best used sparingly, in the situations I mentioned. I would love to see whether it increases performance there, but I don't find the io_uring API very convenient to work with, so I haven't really put any time into it... https://github.com/fjall-rs/lsm-tree/issues/68