rust-lang / futures-rs

Zero-cost asynchronous programming in Rust
https://rust-lang.github.io/futures-rs/
Apache License 2.0
5.4k stars 626 forks source link

Refactor: let `StreamExt::chunks` and `StreamExt::ready_chunks` take `NonZeroUsize` instead of `usize` #2648

Closed janriemer closed 2 years ago

janriemer commented 2 years ago

Hey folks,

Note: after writing all of this below, I noticed std::slice::chunks also uses usize and panics when provided a value that is 0, which is really unfortunate, IMO. Of course, we won't be able to change the std lib and we should keep symmetry between Streams and the std lib, but I'm still interested what you think of this.

Issue

I've noticed that the methods chunks and ready_chunks take a usize as an arg for the capacity. This seems to be a good arg type, because other "capacity" methods in the std lib take usize as well (e.g. Vec::with_capacity) and so there is symmetry between them.

However, there is a huge pitfall and it is actually not symmetric: chunks and ready_chunks will panic, when passed in a usize of 0 (according to docs: This method will panic if capacity is zero.).

The other capacity methods (e.g. on Vec in std) do not panic, when provided with the value 0 (which is sensible, because one can have a Vec that has no allocations).

Proposal

Instead of using usize in the argument type, we should rather use NonZeroUsize, because it guarantees a non-zero value. Note that internally we can still use usize (to make it easier to interact with other methods, such as len() etc.) - this proposed change will only affect the API itself.

Here is an exemplary playground to show the general idea:

use std::error::Error;
use core::num::NonZeroUsize;
fn main() -> Result<(), Box<dyn Error>> {

    let cap_untrusted = 1usize; // imagine that this value comes from "somewhere" and could be 0

    let my_chunk = MyChunk::with_capacity(cap_untrusted.try_into()?);

    println!("{my_chunk:#?}");

    Ok(())
}

#[derive(Debug)]
#[allow(dead_code)]
struct MyChunk {
    cap: usize, // internally, we are free to use usize, so that we have symmetry with e.g. `Vec`...
    //...
}

impl MyChunk {
    // ...but our API is more restrictive and won't accept a value that is 0
    pub fn with_capacity(cap: NonZeroUsize) -> Self {
        Self {
            cap: cap.get()
        }
    }
}

Related

The following issues might be related and should be considered, when implementing this proposal:

2492 especially this comment

BREAKING

This is a breaking change, because the public API changes.

- fn chunks(self, capacity: usize) -> Chunks<Self>
+ fn chunks(self, capacity: NonZeroUsize) -> Chunks<Self>

- fn ready_chunks(self, capacity: usize) -> ReadyChunks<Self>
+ fn ready_chunks(self, capacity: NonZeroUsize) -> ReadyChunks<Self>
taiki-e commented 2 years ago

See https://github.com/rust-lang/futures-rs/pull/1803, https://github.com/smol-rs/async-channel/issues/2, and https://github.com/tokio-rs/tokio/issues/2742. I don't think it is reasonable to do this unless a feature like https://github.com/rust-lang/rust/issues/69329 is implemented.

janriemer commented 2 years ago

Interesting, thank you for the related links. :heart: I didn't know NonZero types should not be used for APIs and is considered unidiomatic Rust.