jswrenn opened this issue 1 year ago (status: Open)
Note that the vast majority of the time, you should be overriding `fold`, not `for_each`. The default `for_each` just calls `fold` with a trivial `()` accumulator, which is easily optimized out (since in the default Rust calling convention, it's not even passed). Thus the only reason to override `for_each` in addition to `fold` is if you can optimize something by knowing that the accumulation is stateless, but off-hand I can't recall any case where I've seen that hold.
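For concreteness, here is a minimal, self-contained sketch (the `Evens` iterator is invented for illustration, not from itertools) of specializing `fold` on a custom iterator; `for_each` then picks the specialization up for free, since its default implementation delegates to `fold`:

```rust
// A toy iterator over even numbers below 10, with a specialized `fold`.
struct Evens {
    n: u32,
}

impl Iterator for Evens {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.n < 10 {
            let v = self.n;
            self.n += 2;
            Some(v)
        } else {
            None
        }
    }

    // Specialized internal iteration: a tight loop instead of
    // repeated `next` calls through the default `fold`.
    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        let mut acc = init;
        let mut n = self.n;
        while n < 10 {
            acc = f(acc, n);
            n += 2;
        }
        acc
    }
}

fn main() {
    // `fold` uses the specialization directly.
    assert_eq!(Evens { n: 0 }.fold(0, |acc, x| acc + x), 20);

    // The default `for_each` calls `fold` with a `()` accumulator,
    // so it benefits from the specialization automatically.
    let mut out = Vec::new();
    Evens { n: 0 }.for_each(|x| out.push(x));
    assert_eq!(out, [0, 2, 4, 6, 8]);
}
```

Here only `fold` is overridden; `for_each`, `sum`, and the other fold-based consumers all route through it by default.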
Whoops, yes, of course. I'll modify the issue accordingly.
TODO list to track `fold` specializations (ordered by file paths and line numbers):

Should be ignored:

And `rfold`:
I think we should mention this issue in related PRs to help maintain this big TODO list.
Once we do this, we should test all of them via `test_specializations`.
I'm interested in contributing to this crate and I dug a bit into the source code. I think one potential candidate to override `fold` is `Combinations`.
https://github.com/rust-itertools/itertools/blob/master/src/combinations.rs#L100
What do you guys think? I can create a PR in the coming few days to implement this.
@Philippe-Cholet I think you're the most knowledgeable there. Could you mentor/review this?
I previously had some thoughts about splitting `next` into two methods to make a `nth` that would create only one vector instead of `n+1` vectors. And I thought earlier it would be better done before `fold`, but no, as it needs to create each item anyway. I guess `fold` would be an interesting and total rewrite of `next`.
@Owen-CH-Leung Hi, sure you can help. And I'll be there to "mentor/review" it.
@jswrenn @phimuemue I've been editing my comment above a lot, and I would like feedback on it.
The list currently consists of all structs implementing `Iterator`; maybe we should ignore some of them?
Should we consider `rfold` specializations as well (small list above)?
@jswrenn, I'd love to get started contributing to Rust, but I'd need a little help getting started since this would be my first ever GitHub contribution. You mention you're happy to provide mentorship in your initial post, so I'm eager to get to work if you could help me out with any of the items on the checklist. Thanks!
@DavidCoy77 I guess that specializing `PutBackN::fold` would be a good first pull request. You could look at #772 to find out what's expected.
Hi @DavidCoy77, apologies for not replying to your emails — I've been traveling. I agree with @Philippe-Cholet that our fold implementations are the best place to start.
Great! Thank you for the help @jswrenn and @Philippe-Cholet. I've spent some time looking at the `WithPosition::fold` in #772 that Philippe mentioned, and unfortunately I'm having a hard time understanding what would be asked of me for `PutBackN::fold`, so if you could offer additional guidance it would be greatly appreciated.
EDIT: The text below is a bit outdated, as benchmarks and specializations are now written for most of our iterators. Writing the `fold` method (and running a benchmark command before/after) is enough to contribute. Example: #820.
@DavidCoy77 I guess there is a lot of new things here (I did not know criterion and quickcheck before coming here, and was not that familiar with `fold` either).

`Iterator::fold`? With it, we consume the iterator and therefore consume all items. Specializing it can be faster than calling `next` repeatedly, as would be the case without specialization. You might not use `fold` directly, but by default a few iterator methods use `fold`, such as `for_each`, which I believe is more well known/used(?).

`itertools::PutBackN`? I know I was not familiar with it, but I like the doctest example of its `put_back` method.

Write `PutBackN::fold`.

Benchmark `PutBackN::fold`, as we don't want to slow things down. Soon enough, this should be automatically provided by a macro I'm writing in another PR.

`test_specializations` takes a clonable iterator and tests the "fold" specialization amongst other method specializations, which simply allows us to write less code.

You can focus on figuring out `PutBackN::fold` first.
```
pub struct PutBackN<I: Iterator> {
    top: Vec<I::Item>,
    iter: I,
}

// From the `next` method, we want the elements of `top` in reversed order,
// then, only then, the elements from `iter` (normal order).
fn next(&mut self) -> Option<Self::Item> {
    self.top.pop().or_else(|| self.iter.next())
}
```
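Building on that snippet, here is one hedged sketch of what a `PutBackN::fold` specialization could look like (written against the simplified struct above, not the actual itertools implementation): drain `top` in reverse, then delegate to the inner iterator's own `fold`:

```rust
// Simplified stand-in for itertools' PutBackN, for illustration only.
struct PutBackN<I: Iterator> {
    top: Vec<I::Item>,
    iter: I,
}

impl<I: Iterator> Iterator for PutBackN<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        self.top.pop().or_else(|| self.iter.next())
    }

    fn fold<B, F>(self, init: B, mut f: F) -> B
    where
        F: FnMut(B, Self::Item) -> B,
    {
        // `next` pops from `top`, so the put-back items come out
        // last-pushed first: iterate `top` in reverse, then delegate
        // to the inner iterator's `fold` (which may itself be specialized).
        let acc = self.top.into_iter().rev().fold(init, &mut f);
        self.iter.fold(acc, f)
    }
}

fn main() {
    // top = [2, 1]: `next` would yield 1, 2, then 3, 4, 5 from `iter`.
    let it = PutBackN { top: vec![2, 1], iter: 3..6 };
    let collected = it.fold(Vec::new(), |mut v, x| {
        v.push(x);
        v
    });
    assert_eq!(collected, [1, 2, 3, 4, 5]);
}
```

The key property to preserve (and what `test_specializations` would check) is that `fold` visits exactly the same items in exactly the same order as repeated `next` calls.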
EDIT: `WithPosition::fold` was a huge rewrite of `WithPosition::next`, not the simplest one. #774 might be a better example and also shows we should delegate to `fold` when possible. I think `PutBackN::fold` will be an even better example once it's done.
@jswrenn I'm curious but don't understand the exact intention behind this yet:

- (Potentially!) Implement a clippy lint to prevent regressions.

A lint that warns us about missing `fold` specializations, is that it?!
Yes, the idea is to add lints to clippy itself to warn us about missing `fold` specializations. For any adaptors where we explicitly didn't want to specialize `fold`, we would need to write `#[allow(clippy::missing_iterator_fold)]`. This would let us audit for missing `fold` specializations by simply running `cargo clippy`, and provide an automated warning if we ever tried to add a new adaptor without a `fold` implementation.
I'm looking to give this issue a try.
I started with `Combinations` and I wonder what folding on that one would look like, with `LazyBuffer` and such.
I could also use some mentoring/walkthrough of the crate if someone is willing to show me around.
@luander I can happily mentor you on this.
Test with `cargo test combinations` (it is fast enough). Benchmark with `cargo bench --bench specializations combinations/fold` (to ignore others), but building this particular benchmark is really slow (almost 2 minutes for me), so the simplest thing to do is to temporarily call `next` while the lazy buffer is not full yet, then do some simpler work once we know for sure it is full.
In the process, we might want to expand `LazyBuffer` abilities.
Specializing `Combinations::fold` would definitely not be as straightforward as for other iterators, but it's an interesting one.
EDIT: Alternatively, `ZipEq::fold` would be simpler.
If clippy does not accept the lint, we could maybe (cleverly ab)use `missing_trait_methods`.
I think it'll be accepted (and for `size_hint` and `[try_][r]fold` also). But probably not for all the iterator methods we could want to specialize, like `count` etc. (that might be too much to ask).
I ran `cargo clippy -- -Wclippy::missing_trait_methods` and got 6752 warnings.
I added the `#[warn(...)]` to `Powerset` only and got only 73 warnings.
If clippy could display JSON data instead of pretty text (or we could hack info out of it), maybe we could leverage that though.
EDIT: I might keep my fork and have a branch implementing other lints, if keeping it updated is simple enough.
Update on the lint idea: after discussion on Zulip, `missing_trait_methods` might accept a configuration (in `clippy.toml`) instead of a new lint, which would be too specific to a single trait method when it could be generalized. It would not be as flexible as we might want, but it would be decent.
There are alternatives though: fork clippy (but maintenance?), use clippy's sister project marker (great developer experience, but it's a work in progress; I can't identify an iterator implementation with it yet), or dylint (should be nice enough, but my first try was a failure).
My poor man's approach to find missing `fold`s (not thoroughly tested, but one gets the idea):

```
cargo clippy --message-format json -- --warn clippy::missing_trait_methods | jq .message.rendered | grep "\bfold\b"
```
@phimuemue Oh... I find it embarrassing that I was aware of `--message-format json` for "check" (never used) but not for "clippy".
After some tinkering on the jq playground, I think I have a new git alias (it's not git-related but useful anyway):

```
missing-trait-methods = "!f() { cargo clippy --quiet --message-format json -- --warn clippy::missing_trait_methods | jq -r '. | select(.message.code.code == \"clippy::missing_trait_methods\") | select(.message.message | endswith(\"`'$2'`\")) | .message.spans[0] | .file_name + \":\" + (.line_start | tostring) + \" \" + .text[0].text' | grep $1 | uniq; }; f"
```
then

```
git missing-trait-methods Iterator fold
git missing-trait-methods DoubleEndedIterator rfold
```

```
src\adaptors\mod.rs:498 impl Iterator for Batching
src\groupbylazy.rs:387 impl<'a, K, I, F> Iterator for Groups<'a, K, I, F>
src\groupbylazy.rs:440 impl<'a, K, I, F> Iterator for Group<'a, K, I, F>
src\groupbylazy.rs:556 impl<'a, I> Iterator for Chunks<'a, I>
src\groupbylazy.rs:599 impl<'a, I> Iterator for Chunk<'a, I>
src\grouping_map.rs:25 impl
```
@luander Can I help you more with `Combinations::fold`?
Alternatively, I suggested `ZipEq::fold`, but if you want to work on `Combinations`, specializing `nth` would be appreciated (it would solve #301 and close the outdated and stuck #329).
@jswrenn A `missing_iterator_fold` lint was not particularly well received: too specific to iterators, basically.
The main alternative: `clippy::missing_trait_methods` could be made configurable to limit warnings (6538 right now):

```
# clippy.toml
missing_trait_methods = [
    "core::iter::Iterator::fold",
    "core::iter::DoubleEndedIterator::rfold",
    "core::iter::Iterator::size_hint",
]
```

The downside is that allowing the lint on an `impl Iterator` would allow not just one missing specialization but all of them. But if we limit ourselves to `size_hint`/`fold`/`rfold`, then this seems good enough to me.
Is that okay with you? If it is, I'll work on making the lint configurable.
Making `clippy::missing_trait_methods` configurable sounds fantastic!
@Philippe-Cholet Hi there. On `Combinations::fold`, once #914 is merged, I think something like this should work as the core of an implementation:

```
// increment_indices returns true once we run out of combinations
while !self.increment_indices() {
    let combination = self.indices.iter().map(|i| self.pool[*i].clone()).collect();
    acc = f(acc, combination);
}
```

If you think that looks reasonable, I'll put through a PR.
Edit: I tried this out and, although it works, it is markedly slower than the unspecialized implementation.
@kinto-b We won't specialize it if it's slower (but that alone is valuable information). About combinations: the slowest thing is heap-allocating the generated items, and `fold` by nature generates them all (to give them to `f`), so we can't win anything on that side.
The only thing `Combinations::fold` could do is use simpler code after the initialization phase, while the default `fold` (repeatedly calling `next`) has "init code" lying around unused after a few steps; basically what you did. But even if it were a perf win, I did not expect much out of it, since the heap allocation is the bottleneck here and nothing else really impacts performance (or at least I think so).
I kinda wonder if mutating a combination IN-PLACE and giving out clones would be faster, but it's a lot of work for a probable small win (if any): `while !self.increment_indices_and_comb(&mut comb) { acc = f(acc, comb.clone()); }` or something like that.
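To make the "mutate in place, hand out clones" idea concrete without depending on itertools internals, here is a small invented sliding-window example; `advance` plays the role of the hypothetical `increment_indices_and_comb`, except that it returns `true` while an item is still available:

```rust
// Invented illustration (not itertools code): one buffer is mutated
// in place on each step, and only clones are handed to the consumer,
// instead of collecting a brand-new Vec per item.
struct Windows {
    data: Vec<u32>,
    size: usize,
    pos: usize,
    buf: Vec<u32>, // reused across steps
}

impl Windows {
    // Overwrite `buf` with the next window; `false` once exhausted.
    fn advance(&mut self) -> bool {
        if self.pos + self.size > self.data.len() {
            return false;
        }
        self.buf.clear();
        self.buf
            .extend_from_slice(&self.data[self.pos..self.pos + self.size]);
        self.pos += 1;
        true
    }
}

fn main() {
    let mut w = Windows {
        data: vec![1, 2, 3, 4],
        size: 2,
        pos: 0,
        buf: Vec::new(),
    };
    // The fold-like loop: clone the in-place buffer for each call to `f`.
    let mut acc: Vec<Vec<u32>> = Vec::new();
    while w.advance() {
        acc.push(w.buf.clone());
    }
    assert_eq!(acc, [vec![1, 2], vec![2, 3], vec![3, 4]]);
}
```

As the comment above notes, each step still heap-allocates one clone for the consumer, which is why this pattern promises at best a small win when allocation dominates.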
I will eventually work again on configuring `clippy::missing_trait_methods` and (if accepted) we will be able to mark that some `fold`s are allowed to not be specialized, and ensure all the others are.
Implementing `fold` for our iterators allows us to provide more performant internal iteration. We provide it in some cases, but not all. I'd like us to take three steps towards resolving this:

- … `fold`.
- Implement `fold` for missing adapters.
- (Potentially!) Implement a clippy lint to prevent regressions.

Happy to provide mentorship for any of these items.