rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.72k stars 309 forks source link

Collect result helper method #832

Closed JayJeyaruban closed 9 months ago

JayJeyaruban commented 9 months ago

I created a utility function .collect_res() as an alias to .collect::<Result<Vec<_>, _>>() the same way .collect_vec() is to .collect::<Vec<_>>(). I've found this fairly convenient and helpful.

If there's interest I could look to add it to itertools. My current implementation is on a standalone trait:

pub trait CollectResult: Iterator {
    fn collect_res<O, E>(self) -> Result<Vec<O>, E>
    where
        Self: Sized,
        Self: Iterator<Item = Result<O, E>>,
    {
        self.collect::<Result<Vec<O>, E>>()
    }
}
impl<I> CollectResult for I where I: Iterator {}

Think it should be a pretty straight forward change.

Small test for example use and how it compares to standard .collect and .try_collect.

    #[test]
    fn test() {
        fn inner() -> anyhow::Result<u32> {
            let safe_numbers = (1..20)
                .into_iter()
                .map(|i| match i {
                    1..=10 => Ok(i),
                    i => bail!("Number not okay: {i}"),
                })
                .map_ok(|i| i * 21)
                .collect_res()?;
            // .try_collect::<_, Vec<_>, _>()?
            // .collect::<Result<Vec<_>, _>>()?

            let summed = safe_numbers.iter().sum::<u32>();

            let min = safe_numbers.iter().min().context("unable to compute min")?;

            Ok(summed - min)
        }

        assert!(inner().is_err());
    }
phimuemue commented 9 months ago

Hi there. I somewhat know the urge to have convenience methods like this, but I think they do not bring that much benefit - in the end, users must learn another idiom that is not much better than collect::<Result<Vec<_>, _>.

On top of that, I find tying it to Vec too specific. If @jswrenn or @Philippe-Cholet wants to pick it up, fine. But I'd vote to close this until we have reliable evidence that lots of people actually want this.

Philippe-Cholet commented 9 months ago

self.collect() is enough inside collect_res as the function signature is enough information.

Personally, I have a ok_collect_vec and a bunch of other names (which are self.collect() with names and specific types) to solve Advent of Code out of pure convenience, but I usually convert them to try_collect later as I expect try_collect to be the goto function as I think it will be stabilized some day.

We have try_collect (in conflict with nightly) and collect_vec which is probably widely used, I'm unsure we should add yet another self.collect() shortcut. I agree with @phimuemue.

PS: There is also ok_it.process_results(|it| it.map(...).collect_vec()).

JayJeyaruban commented 9 months ago

@Philippe-Cholet thanks for your consideration.

A couple extra points vs try_collect and process_results. I've found that the inference of try_collect (as well as std collect) requires me to add the turbofish boilerplate consistently (similar motivation for why I like using collect_vec over collect) - hence the commented lines in my example test containing them.

I didn't look at the underlying implementation for process_results but I assume it doesn't consume the iterator so it's the better choice in those situations. I just subjectively like reaching for collect_res instead since frequently consuming the iterator isn't a concern and it reduces nesting :).

I understand the hesitation given the myriad of ways collecting can be performed already though and thanks again for your consideration.

Philippe-Cholet commented 9 months ago

I surely don't like using turbofish notation when I can avoid it, but unless you want to call a method on it rather than assign it to a variable (it.collect_res()?.into_iter().sorted().... vs let name: Vec<_> = it.try_collect()?;), I found try_collect quite reasonable (and again there is the it.process_results(|i| i.sorted()....) alternative).