tokio-rs / tokio-uring

An io_uring backed runtime for Rust
MIT License

API: Fix unintuitive / broken `set_init()` behaviour #215

Closed ollie-etl closed 1 year ago

ollie-etl commented 1 year ago

The original discussion came up in https://github.com/tokio-rs/tokio-uring/pull/213#discussion_r1084501623. I'll rephrase my arguments here:

Having bytes_init() only ever go up is broken. Consider the following example, which copies data from a socket into a file:

let mut buf = Vec::with_capacity(2048);
loop {
    // recv hands the buffer back alongside the result
    let (_, b) = socket.recv(buf).await;
    let (_, b) = target.write_at(b, 0).await;
    buf = b;
}

This example is broken, because the set_init function is a one-way ratchet which can never be unset. It is called in recv, and in all other read-like APIs which fill buffers, to record the number of bytes read. However, set_init, in the current implementation, ignores the value if it is less than the previous number of initialized bytes. Unfortunately, large parts of the write-like functions in the current API (and I) assume that bytes_init is equivalent to the valid bytes of interest. As it stands, IoBuf has no concept of length, only of bytes which have historically been valid (a historical maximum length).
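
To make the ratchet concrete, here is a minimal, self-contained sketch of the behaviour being described. This is illustrative only, not the actual tokio-uring source; `RatchetBuf` is a made-up name.

```rust
// Illustrative model of the current set_init behaviour: a one-way ratchet.
struct RatchetBuf {
    data: Vec<u8>,
    init: usize, // historical maximum of initialized bytes
}

impl RatchetBuf {
    fn with_capacity(cap: usize) -> Self {
        RatchetBuf { data: Vec::with_capacity(cap), init: 0 }
    }

    // Values below the current watermark are silently ignored.
    fn set_init(&mut self, pos: usize) {
        if pos > self.init {
            self.init = pos;
        }
    }

    fn bytes_init(&self) -> usize {
        self.init
    }
}
```

After a 1024-byte read followed by a 16-byte read into the same buffer, bytes_init() still reports 1024, which is exactly what trips up the copy loop above.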

It is never possible to call a write-like function with a reused buffer. The IoBuf does not carry information on how many bytes to write.

This PR gets rid of the lifetime initialization counter and replaces it with the meaning of currently initialized bytes: i.e., after a read, the number of initialized bytes is the number of bytes read.
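
Under the proposed semantics, the same sketch would simply record the value. Again illustrative, not the actual patch; `ExactBuf` is a made-up name.

```rust
// Illustrative model of the proposed semantics: set_init records the
// count exactly, so a shorter read lowers the watermark.
struct ExactBuf {
    data: Vec<u8>,
    init: usize, // bytes initialized by the most recent operation
}

impl ExactBuf {
    fn with_capacity(cap: usize) -> Self {
        ExactBuf { data: Vec::with_capacity(cap), init: 0 }
    }

    fn set_init(&mut self, pos: usize) {
        self.init = pos; // no ratchet
    }

    fn bytes_init(&self) -> usize {
        self.init
    }
}
```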

ollie-etl commented 1 year ago

@FrankReh @carllerche @Noah-Kennedy @mzabaluev . Many opinions wanted on this

mzabaluev commented 1 year ago

You can get the number of bytes that were actually read from the return value of the recv operation and slice the buffer's data with it; see the examples for how the APIs are meant to be used. You should not ignore the number of bytes returned by write_at either, because it can be less than the number of bytes in the input buffer/slice. I agree that this API is not intuitive, but it is what we have in order to account for two separate concerns.
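
The point about short writes can be sketched against std::io::Write, used here as a stand-in for the uring ops, since both report a count that may be less than the input length:

```rust
use std::io::{ErrorKind, Result, Write};

// Loop on the count returned by each write instead of assuming the whole
// input was consumed; the same discipline applies to write_at's result.
fn write_all_loop<W: Write>(w: &mut W, data: &[u8]) -> Result<()> {
    let mut off = 0;
    while off < data.len() {
        match w.write(&data[off..])? {
            0 => return Err(ErrorKind::WriteZero.into()),
            n => off += n, // short write: advance and retry
        }
    }
    Ok(())
}
```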

See ReadBuf in tokio as an API that keeps track of both.
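
A rough model of the two cursors that ReadBuf separates. This is the idea only, not tokio's actual code; the field names just mirror ReadBuf's terminology.

```rust
// `filled` is the valid data from the current use; `initialized` is the
// high-water mark that only grows. Clearing for reuse resets one but not
// the other, so both concerns stay tracked independently.
struct TrackedBuf {
    data: Vec<u8>,
    filled: usize,
    initialized: usize,
}

impl TrackedBuf {
    fn new(cap: usize) -> Self {
        TrackedBuf { data: vec![0; cap], filled: 0, initialized: 0 }
    }

    fn put_slice(&mut self, src: &[u8]) {
        self.data[self.filled..self.filled + src.len()].copy_from_slice(src);
        self.filled += src.len();
        self.initialized = self.initialized.max(self.filled);
    }

    fn clear(&mut self) {
        self.filled = 0; // valid length resets; the watermark does not
    }
}
```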

FrankReh commented 1 year ago

This is a good issue to track. I don't think a PR that only proposes changing a few files in tokio-uring does the question justice, though.

And maybe this is one of those questions where everyone will have a different opinion from everyone else, so since there is no BDFL, the status quo wins.

My own feeling is that the `tokio-uring` crate shouldn't be tracking an init field in IoBuf at all, in order to optimize for the case where it is used as a long-lived server. When someone chooses io_uring, they are picking a technology that improves throughput when the system is busy, at the cost of latency for single operations when the system is not busy. The overhead of tracking an init marker in each I/O buffer runs counter to that direction. When a buffer is going to be used many times, it has to be better for throughput if it were zeroed once and then reused with no init watermark consuming data space and code space.

I believe we are here because Rust was designed to be fast, and so not to require initialization of things that weren't going to be used, and now it's almost an accident that we still have that in this library. At least that's how I rationalize the current state. On the other hand, I'm used to it now, so it doesn't bother me. What does bother me is not seeing an efficient roadmap for using this library in place of standard `tokio` in higher-order crates like `hyper`. But that's another issue.

mzabaluev commented 1 year ago

When a buffer is going to be used many times, it has to be better for throughput if it were to be zeroed once and then reused with no init watermark using data space and code space.

I think it would make sense for fixed buffers at least. However, this is not what the IoBuf/IoBufMut abstraction provides in general. And there may be applications that do not reuse buffers, but instead allocate them from the heap once per I/O operation. I feel that this case should be fully supported with no degradation in performance (assuming the allocator is not a bottleneck there).

mzabaluev commented 1 year ago

Also yes, there should be an issue to track this as an architectural change.

ollie-etl commented 1 year ago

Tracking the length of initialized bytes in the buffer: there is no reason to ever mark bytes as uninitialized after they've been initialized, and the only purpose of this watermark is to prevent safe access to uninitialized data.

I cannot think of a single implementation where unsafety would be caused by reverting the watermark to the current read level. This is because setting init lower than the true value cannot cause access to uninitialized bytes.
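
The soundness argument in miniature: if every safe read path is gated by the watermark, then lowering it can only hide bytes, never expose uninitialized memory. Illustrative sketch with made-up names.

```rust
// Safe access only ever sees data[..init], so a smaller init narrows the
// window over bytes that were already initialized.
struct GatedBuf {
    data: Vec<u8>,
    init: usize,
}

impl GatedBuf {
    // The only safe read path goes through this slice.
    fn initialized_bytes(&self) -> &[u8] {
        &self.data[..self.init]
    }
}
```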

ollie-etl commented 1 year ago

And there may be applications that do not reuse buffers, but allocate them from the heap once per each I/O operation. I feel that this case should be fully supported with no degradation in performance (assuming the allocator is not a bottleneck there).

The API change proposed here would make no difference to the single-use case.

ollie-etl commented 1 year ago

Moved to issue

FrankReh commented 1 year ago

I feel that this case should be fully supported with no degradation in performance (assuming the allocator is not a bottleneck there).

Fair enough. I know it's just my opinion. (I use that term a lot for this repo's issues.)

FrankReh commented 1 year ago

I cannot think of a single implementation where unsafety would be caused by reverting the watermark to the current read level.

Not brought up before, but another use for reading into buffers is reading into buffer slices. Just as reading into a slice of a Vec doesn't change the Vec's length, our reads don't need to change a property of the buffer itself; there is a return argument for the length. It's true that other I/O calls could then just return the length and wouldn't have to return the buffer.
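
The analogy can be checked with std::io directly: reading into a slice of a Vec returns a count and leaves the Vec's length alone. Here `&[u8]` serves as a stand-in Read source.

```rust
use std::io::Read;

// Reading into v[..] fills bytes and returns the count; v.len() is
// untouched, just as the comment describes for Vec slices.
fn slice_read_demo() -> (usize, usize) {
    let mut v = vec![0u8; 8];
    let mut src: &[u8] = b"hi";
    let n = src.read(&mut v[..]).unwrap();
    (n, v.len())
}
```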