rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.62k stars 298 forks source link

Iterators generating (many) vectors (v2) #922

Open Philippe-Cholet opened 2 months ago

Philippe-Cholet commented 2 months ago

My #722 was rejected because it was not truly user-convenient nor truly performant but only in the middle.

Observation

But it can be more performant in some cases without sacrificing user convenience as I realized (while reviewing #914) that some iterator methods do not really need to give away all generated vectors:

Nightly (just to have an idea of what might come): one item (advance[_back]_by, try_find), two items (is_sorted, is_sorted_by).

(Except nth[_back],) all of those usually rely on fold/try_fold which requires to own every item.

Implementation idea

But we could implement a private lightweight try_fold that only take references to an internal item that we would carefully update step-by-step. Then we can easily enough specialize multiple iterator methods, like find that probably has the wider usage (e.g. used by .filter(..)).

After some experiments...

/*private*/ trait LightweightIterator: Iterator + Sized {
    // I'm not sure we need `B`: a lightweight `try_for_each` might be enough. But pass an argument `()` is free!
    fn lw_try_fold<B, F, E>(&mut self, init: B, f: F) -> LoopExit<Self::Item, B, E>
    // ^^^               ^                               ^^^^^^^^ ^^^^^^^^^^^^
    where F: FnMut(B, &Self::Item) -> Result<B, E>;
    //                ^               ^^^^^^^^^^^^

    // Then it would provide "lw_" versions of:
    // count last nth find (cmp partial_cmp eq?) max_by max_by_key min_by min_by_key!
}

// `try_fold` basically returns `Result<B, E>`. With an additional internal item `T`:
enum LoopExit<T, B, E> {
    /// Iteration has ended.
    End {
        last_item: Option<T>,
        accum: B,
    },
    /// The function failed.
    Early {
        failed_item: T,
        err: E,
    },
}

Our iterators returning vectors are Combinations, Powerset, CombinationsWithReplacement, Permutations and MultiProduct.

Usage example

impl<I> LightweightIterator for Combinations<I> where ... {
    fn lw_try_fold<B, F, E>(&mut self, init: B, f: F) -> LoopExit<Self::Item, B, E>
    where F: FnMut(B, &Self::Item) -> Result<B, E> { ... }
}

impl<I> Iterator for Combinations<I> where ... {
    fn find<P>(&mut self, predicate: P) -> Option<Self::Item>
    where
        P: FnMut(&Self::Item) -> bool,
    {
        self.lw_find(predicate) // easy
    }
    // then at least... (max|min)_by[_key]
}

Plan