rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.72k stars 309 forks source link

Feature request: `.enumerate(n)` where enumeration starts with index `n` instead of 0 #815

Open fritzrehde opened 10 months ago

fritzrehde commented 10 months ago

I am currently solving Advent of Code day 07 part 1, and wrote this code:

    let total_winnings: usize = puzzle_input
        .lines()
        .map(|line| line.parse::<HandWithBid>().unwrap())
        .sorted()
        // Give each hand a rank.
        .enumerate()
        // The rank starts at 1, not 0.
        .map(|(i, hand)| (i + 1, hand))
        .map(|(rank, hand)| hand.bid * rank)
        .sum();

I needed to give each iterator element a rank, starting at 1. Therefore, the only hacky solution I came up with was to .enumerate() as normal, and then increment each index i to get the rank. I think the code would a little cleaner/idiomatic if I could write:

    let total_winnings: usize = puzzle_input
        .lines()
        .map(|line| line.parse::<HandWithBid>().unwrap())
        .sorted()
        // Give each hand a rank, starting at 1.
        .enumerate(1)
        .map(|(rank, hand)| hand.bid * rank)
        .sum();

Would it be possible to implement something like that in itertools? Thanks for the awesome crate!

Philippe-Cholet commented 10 months ago

I'm doing AoC as well and I sure do +1 as much as anybody. But do that would clash with core::iter::Iterator::enumerate which we won't do here. Obviously, we could change the name to enumerate1 (just an example). It's nice to have shortcuts for AoC but this is a such trivial shortcut that I don't think we would do it.

But what you can do is your own super-trait of Iterator in your utilities, something like

trait IteratorExt: Iterator {
    fn enumerate1(self) -> std::iter::Map<std::iter::Enumerate<Self>, fn((usize, Self::Item)) -> (usize, Self::Item)>
    where
        Self: Sized,
    {
        self.enumerate().map(|(idx, item)| (idx + 1, item))
    }
    // EDIT: There is zip I forgot:
    fn enumerate_n(self, n: usize) -> std::iter::Zip<std::ops::RangeFrom<usize>, Self>
    where
        Self: Sized,
    {
        (n..).zip(self)
    }
}

impl<I: Iterator> IteratorExt for I {}

Then use my_utils::IteratorExt; for your solvers.

~Make a method enumerate_n(n: usize) does not seem to be that simple though.~

EDIT2: The other alternative is simply .zip(1..) (but the tuple argument is reversed) instead of a custom enumerate_n(1).

jswrenn commented 10 months ago

This is a neat idea! You can even make it a little more generic:

use core::ops::RangeFrom;

pub fn itemize<I, A>(iter: I, mut start: RangeFrom<A>) -> impl Iterator<Item = (A, I::Item)>
where
    I: Iterator,
    A: 'static,
    RangeFrom<A>: Iterator<Item = A>,
{
    iter.map(move |e| (start.next().unwrap(), e))
}

...allowing calls like itemize(iter, 'a'..). It's too bad the Step trait isn't stable — that would allow even more flexibility.

Philippe-Cholet commented 10 months ago

That's like start.zip(self) (I therefore edited my previous comment).

The downside of your function is that we can't put it in a trait because of the opaque type of the function.

jswrenn commented 10 months ago

I'm going to go ahead and add this to itertools. The RangeFrom formulation is compellingly powerful, and the Zip return type means it has almost no maintenance burden.

Philippe-Cholet commented 10 months ago

I'm not sure but I feel like a "zip" version would be less efficient than .enumerate.map?

fritzrehde commented 10 months ago

I'm doing AoC as well and I sure do +1 as much as anybody. But do that would clash with core::iter::Iterator::enumerate which we won't do here. Obviously, we could change the name to enumerate1 (just an example). It's nice to have shortcuts for AoC but this is a such trivial shortcut that I don't think we would do it.

But what you can do is your own super-trait of Iterator in your utilities, something like

trait IteratorExt: Iterator {
    fn enumerate1(self) -> std::iter::Map<std::iter::Enumerate<Self>, fn((usize, Self::Item)) -> (usize, Self::Item)>
    where
        Self: Sized,
    {
        self.enumerate().map(|(idx, item)| (idx + 1, item))
    }
    // EDIT: There is zip I forgot:
    fn enumerate_n(self, n: usize) -> std::iter::Zip<std::ops::RangeFrom<usize>, Self>
    where
        Self: Sized,
    {
        (n..).zip(self)
    }
}

impl<I: Iterator> IteratorExt for I {}

Then use my_utils::IteratorExt; for your solvers.

~Make a method enumerate_n(n: usize) does not seem to be that simple though.~

EDIT2: The other alternative is simply .zip(1..) (but the tuple argument is reversed) instead of a custom enumerate_n(1).

Just to clarify, obviously it shouldn't be named .enumerate() to conflict with the std lib implementation. I thought of .enumerate_from(n), .enumerate_from_index(n), .enumerate_starting_from_index(n). But not sure what kind of preferences you have on verbosity of method names.

fritzrehde commented 10 months ago

I looked at Enumerate<I> itself in the std lib, and it's a simple:

pub struct Enumerate<I> {
    iter: I,
    count: usize,
}
impl<I> Enumerate<I> {
    pub(in crate::iter) fn new(iter: I) -> Enumerate<I> {
        Enumerate { iter, count: 0 }
    }
}

So implementation there would obviously be trivial, just allow starting from count 1. I am also kind of proposing adding this feature in the first place because of very slight efficiency concerns. Of course:

        .enumerate()
        // The rank starts at 1, not 0.
        .map(|(i, hand)| (i + 1, hand))

is just a simple addition, which should't cost much performance, but it makes it a tiny bit less readable and also probably is a tiny bit slower. I don't really see any downsides to adding this to itertools (or stdlib, but that probably wouldn't happen, right?), since implementation should be fairly straightforward and maintainable, right?

fritzrehde commented 10 months ago

Do you think I should make this feature request to the official rust std lib instead? Like detailed in my last message, I think it would be very easy to implement this in the std lib, and it would have zero overhead (no extra zip or map).

Philippe-Cholet commented 10 months ago

Well they added enumerate and some of them probably thought of an alternative start than 0 so I suspect they would not be interested, but it's only a guess.

Coming back to this, I see that (n..).zip(self) would not be ExactSizeIterator even self is because n.. is not. ~But it would with n..=usize::MAX, except it might be shorter than self though in some cases.~ (EDIT: too fast, no such implementation) It's so much simpler when n == 0.