tokio-rs / tokio-uring

An io_uring backed runtime for Rust
MIT License
1.14k stars 122 forks source link

[buf/fixed] FixedBufAllocator implementation #219

Closed hellertime closed 1 year ago

hellertime commented 1 year ago

This PR introduces the concept of a FixedBufAllocator.

I wanted to provide a third option for obtaining FixedBuf from the tokio-uring runtime. While both the FixedBufRegistry and the FixedBufPool are excellent choices when it is known precisely what sized buffers you will need upfront. In times when that isn't well known, falling back on dynamic memory management is a good option.

This PR makes some modifications to the internals of the FixedBuf so that the drop behavior makes sense wether we need to "check-in" the buffer or "free" it.

There is also a test case included to exercise the allocator.

One limitation of the allocator approach at the moment is we cannot grow a heap beyond a single registered iovec. There may be a way around this limitation, if we can ensure allocations never cross certain byte addresses (if thats possible we can just map multiple iovec to contiguous ranges on the heap), but for now heaps are limited to ~1GiB (the allocator itself consumes a bit of the heap for alignment and bookkeeping).

hellertime commented 1 year ago

It appears the CI environment has a more restrictive RLIMIT_MEMLOCK than my development host. Is it possible to see what the current limit is/raise that limit somehow?

FrankReh commented 1 year ago

Thank you for providing a series of commits to break up the functionality changes into logic units. Should make it easier to understand but have to admit, I don't have the powers of concentration at the moment to understand the implications of the first commit. I understand it makes the API more general without adding anything yet - which is great.

So will it make it general in the way that @mzabaluev and @ollie-etl and I have always hoped? So other registries or pools can be dropped in? Would it allow the second commit to done in an outside crate?

And is this entire PR a core piece that should come with fixed buffers or is this now the third incarnation that could or should have been put into an extras crate. I felt like we should have an extras crate even for the first fixed buffer registry, at least that's what I recall, and with every review and prolonged discussion, I wish more and more that's what we had done.

It's possible the first commit is extending the API in a core way that is useful and the commit that wants to pull in a new allocation crate is a prime example of an opinion that many might share, but that wouldn't make it into a tokio sponsored repo. As I've said at other times, a new crate being introduced to tokio-uring probably won't fly if it isn't already used by tokio. No one more central to tokio corrected that line of reasoning (maybe didn't see it).

And while I want to get PRs committed as soon as possible when they don't have conflicts, we already have two PRs centered on fixed buffers that need to be resolved first. Doesn't mean we shouldn't review and comment now if possible.

Sorry if this came across as too blunt. It wasn't my intent but I know the text-only communication we have with this medium doesn't show my true interest and I didn't want to wait a whole day before responding. Contributions are valuable and the more people who get good at understanding things like the fixed buffer registry and pool there are, the better. So already it's impressive.

hellertime commented 1 year ago

Thank you for providing a series of commits to break up the functionality changes into logic units. Should make it easier to understand but have to admit, I don't have the powers of concentration at the moment to understand the implications of the first commit. I understand it makes the API more general without adding anything yet - which is great.

So will it make it general in the way that @mzabaluev and @ollie-etl and I have always hoped? So other registries or pools can be dropped in? Would it allow the second commit to done in an outside crate?

This is a great question! The commit does make it easier to develop other registries within the tokio-uring crate, it does not expose enough power to build a new registry on top of the tokio-uring crate. The issue comes down to the implementation of FixedBuf, its not possible for a user of tokio-uring to construct a new FixedBuf directly (as we must here in the allocator changes in the subsequent commit: https://github.com/tokio-rs/tokio-uring/pull/219/commits/2834ed59a9f50c96ba47bb0be1fccc8a4c5f4101#diff-e8d535e190483efc25b4b518284f1e99e0e20add5c0191f2ebb462638fffc9dcR139), as new() is not public, any registry needs to return FixedBuf to meet the call signature of any of the "fixed" calls.

I do actually have another implementation on a branch (https://github.com/hellertime/tokio-uring/tree/hellertime/buf-generic-fixed) that creates an entirely new pair of BoundedFixedBut/Mut types and associated IoBufFixed/Mut traits which expose the buf_index() needed for "fixed" calls.

However I prefer this implementation as it feels unclean to let users create types which expose a "buf_index" arbirarily.

It also is a much more intrusive patch-set, as every call that touches a fixed call is currently bound to BoundedBufMut<BufMut=FixedBuf>.

BUT! I would love to have a cleaner way to implement this outside of tokio-uring. I have no idea what that API shape would be just yet!

And is this entire PR a core piece that should come with fixed buffers or is this now the third incarnation that could or should have been put into an extras crate. I felt like we should have an extras crate even for the first fixed buffer registry, at least that's what I recall, and with every review and prolonged discussion, I wish more and more that's what we had done.

An extras crate for the registries makes sense only if there is at least one way to obtain FixedBuf in the default crate.

It's possible the first commit is extending the API in a core way that is useful and the commit that wants to pull in a new allocation crate is a prime example of an opinion that many might share, but that wouldn't make it into a tokio sponsored repo. As I've said at other times, a new crate being introduced to tokio-uring probably won't fly if it isn't already used by tokio. No one more central to tokio corrected that line of reasoning (maybe didn't see it).

I don't have a good grounding here, so can't add much. I do understand the desire for minimal dependencies. If we could find an approach to expose access to registered buffers in a way that is safe(ish) and clean, it would be nice to rip out all the registry/allocator logic.

And while I want to get PRs committed as soon as possible when they don't have conflicts, we already have two PRs centered on fixed buffers that need to be resolved first. Doesn't mean we shouldn't review and comment now if possible.

Mind linking these into this PR, it feels reasonable to consider these blocking changes, or at least question if they should be in this PR.

Sorry if this came across as too blunt. It wasn't my intent but I know the text-only communication we have with this medium doesn't show my true interest and I didn't want to wait a whole day before responding. Contributions are valuable and the more people who get good at understanding things like the fixed buffer registry and pool there are, the better. So already it's impressive.

No worries, again the initial question is a great one. I am eager to make use of this code, so I would be happy to see the PR land. But I can continue to leverage the FixedBufPool for now while I wait.

FrankReh commented 1 year ago

@hellertime @ollie-etl @mzabaluev Nibbling a little at a time ... are we in agreement that to allow external creation of fixed buffers, we have to define a trait that gives us the fixed buf index when we need it?

Somehow I thought the question could boil down to a general allocator and iterator for setup and drop for deallocation and the indexing would be the purview of the tokio_uring crate. But maybe allocation can't always be known up front?

There is the provided buffers feature of urings also that we should take into account. Does continuing to extend abilities with fixed buffers come at an unnecessary cost when simply starting to support provided buffers would be solving more with less? I seem to recall the provided buffers come with a group_id and there can be many groups of provided buffers, each with their own group_id.

I'm much more familiar with the buf_ring feature at the moment but they are best suited for the multi read operations which we will eventually get to I think. The earlier uring buffer options are a little complicated without going over the uring man pages again because those same man pages never try to lay out the pros and cons of each scheme. They just seem to fold new man pages along with new features as the uring device evolves.

I know fixed buffers was the first to be supported so would cover more kernels, but if we keep trying to support the earliest kernels, we will continue to make very slow progress.

hellertime commented 1 year ago

apologies. I unintentionally closed this PR.

rrichardson commented 1 year ago

There is the provided buffers feature of urings also that we should take into account. Does continuing to extend abilities with fixed buffers come at an unnecessary cost when simply starting to support provided buffers would be solving more with less?

It seems that going with provided buffers would allow the use of any old Buffer wrapper. It wouldn't know or care about an index. Using the IORING_REGISTER_PBUF_RING approach would, I think, not even require a buffer to know its buffer group ID. It'd just need to know how to recycle itself back into the allocator/pool whence it came, to be handed back to the ring.

That said, PBUF_RING makes this allocator scheme even more valuable, since it would allow us to add new buffers/groups at runtime.

hellertime commented 1 year ago

@hellertime @ollie-etl @mzabaluev Nibbling a little at a time ... are we in agreement that to allow external creation of fixed buffers, we have to define a trait that gives us the fixed buf index when we need it?

Somehow I thought the question could boil down to a general allocator and iterator for setup and drop for deallocation and the indexing would be the purview of the tokio_uring crate. But maybe allocation can't always be known up front?

There is the provided buffers feature of urings also that we should take into account. Does continuing to extend abilities with fixed buffers come at an unnecessary cost when simply starting to support provided buffers would be solving more with less? I seem to recall the provided buffers come with a group_id and there can be many groups of provided buffers, each with their own group_id.

I'm much more familiar with the buf_ring feature at the moment but they are best suited for the multi read operations which we will eventually get to I think. The earlier uring buffer options are a little complicated without going over the uring man pages again because those same man pages never try to lay out the pros and cons of each scheme. They just seem to fold new man pages along with new features as the uring device evolves.

I know fixed buffers was the first to be supported so would cover more kernels, but if we keep trying to support the earliest kernels, we will continue to make very slow progress.

If I understand the use case of both the Provided Buffers and PBUF_RING these are really useful for reads, but don't support writes. So until such support is added (if ever) I think the fixed buffers have a place in the tokio-uring ecosystem.

As for focusing on FixedBuf over the other implementations, I agree it shouldn't take up all the bandwidth.

As for the FixedBuf implementation itself, I'd rather not develop a trait that exposes the buf_index() as that feels like it opens up an avenue for abuse.

It would be interesting if we could split off a new FixedBuf from an existing FixedBuf.

For instance, if we could do the following:

use tokio_uring::buf::fixed::FixedBufRegistry
use buddy_alloc::buddy_alloc::BuddyAlloc;
use buddy_alloc::BuddyAllocParam;

let registry = FixedBufRegistry::new(vec![0u8;1<<20]);
tokio_uring::start(async{
    registry.register()?;
    let fixed_buf = registry.check_out(0).unwrap();
    let mut alloc = unsafe { BuddyAlloc::new(BuddyAllocParam::new(fixed_buf.stable_ptr(), fixed_buf.bytes_total(), 4096)) };
    let ptr = alloc.malloc(32);
    let fixed_buf2 = fixed_buf.split_fixed_buf().addr(ptr).len(32).dtor(|ptr| alloc.free(ptr)).build().unwrap();
    ...
    Ok(())
}

The key point here is the builder would only succeed if addr and len were contained within stable_ptr() and bytes_total(). The new FixedBuf would use the dtor on drop, and so it wouldn't "check in" to the registry that created it. It would be something akin to a Slice but not a different type. It would inherit the same index as the FixedBuf it was split from. And it would allow creating this allocator instance externally to the tokio-uring crate.

This idea is somewhat like the slice_ref method for Bytes: https://docs.rs/bytes/latest/bytes/struct.Bytes.html#method.slice_ref

I'm not even sure this requires the Registry to exist, but for now thats how we obtain the initial FixedBuf containing the entire iovec that was registered.

Even more of a hair-brained idea: if we unified on a ref-countable type like Bytes except with the addition of stable(_mut)_ptr (call it IoBytesMut). Imagine if register had a signature like fn register(bufs: Iter<Item=IoBytesMut>) -> Iter<Item=(BufIndex,IoBytesMut)>. And the "fixed" calls had two flavors fn write_at_fixed(idx: BufIndex, buf: IoBytesMut) and unsafe fn write_at_fixed_unchecked(idx: BufIndex, buf: IoBytesMut). The BufIndex can be used to validate that IoBytesMut has been registered (say it contains the base_ptr and len of the registered buffer), and it would work for any slice of that IoBytesMut. This sketch is a bit low level, and hand waves some things. But it would be quite nice if it was possible to decouple registration from the type which contains and slices the buffers.

oliverbunting commented 1 year ago

I don't have the bandwidth to comment currently, sorry

hellertime commented 1 year ago

I've been following what's been going on in https://github.com/tokio-rs/tokio-uring/issues/216, and my thought is it would be better to close this PR, and focus more on the cleanup of the buffer types used in tokio-uring.

It's my hope that in general the direction will be to expose enough flexibility in the buffer types to allow this type of allocator I have proposed here to be developed atop tokio-uring and not within.

I think it was @FrankReh who asked should something like this be in an extras crate and ultimately I think the answer is yes.

FrankReh commented 1 year ago

Thanks Chris. I appreciate the help in trying to keep the submissions, their order, and ourselves sane. Feel free to weigh in on other discussion and you can always open this again or even just ask to look more closely at that first commit again. I was going to do that later today in order to better understand the kind of generality you were looking for with it.

I don't expect it to happen, but I would pleasantly surprised if we can get all the uring enabled buffer types into a scheme that everyone is happy with that fits into the Completable async model people have talked about for a few years now. This crate suffers by being at the cross-roads in a few ways, one of them being can it ever be slipped into higher level functionality like hyper and also support all the special use cases for the io_uring device that people who want to work with tokio could dream up.