Specialize `Positions::[r]fold`

rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.

https://docs.rs/itertools/

Apache License 2.0

2.72k stars 309 forks source link

Specialize `Positions::[r]fold` #813

Closed Philippe-Cholet closed 10 months ago

Philippe-Cholet commented 10 months ago

Related to #755

First, I have to say that after adding many benchmarks in #806, building specialization benchmarks is now slower: 1mn53s hence more than 4mn to run this benchmarking command twice:

cargo bench --bench specializations "positions/r?fold"

positions/fold          time:   [908.94 ns 911.83 ns 914.93 ns]
positions/fold          time:   [440.01 ns 442.34 ns 445.05 ns]
                        change: [-52.042% -51.081% -49.844%]

positions/rfold         time:   [580.24 ns 581.23 ns 582.15 ns]
positions/rfold         time:   [612.87 ns 614.28 ns 615.81 ns]
                        change: [+5.5346% +6.7070% +7.8680%]

Benchmarks all rely on core::slice::Iter which does not specialize rfold but I did not thought it would be slower. It's not by much but still. What do you think?

Philippe-Cholet commented 10 months ago

@phimuemue After changing fold (to use enumerate below), I re-ran the benchmark and roughly got 500 ns (so slower). My guess is that it's because + count is done much more often and not once. (The +1 operations are only moved to enumerate.)

    let count = self.count;
    let mut f = self.f;
    self.iter.enumerate().fold(init, |mut acc, (idx, val)| {
        if f(val) {
            acc = func(acc, idx + count);
        }
        acc
    })

A very small loss for rfold.

phimuemue commented 10 months ago

@phimuemue After changing fold (to use enumerate below), I re-ran the benchmark and roughly got 500 ns (so slower). My guess is that it's because + count is done much more often and not once. (The +1 operations are only moved to enumerate.)
    let count = self.count;
    let mut f = self.f;
    self.iter.enumerate().fold(init, |mut acc, (idx, val)| {
        if f(val) {
            acc = func(acc, idx + count);
        }
        acc
    })
A very small loss for rfold.

I meant replacing iter/count in Positions entirely, not just in [r]fold. So, I suggest merging this PR, and afterwards replacing Positions's fields.