rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.72k stars 309 forks source link

Implement custom fold for `WhileSome` #780

Closed Owen-CH-Leung closed 12 months ago

Owen-CH-Leung commented 1 year ago

755

This PR introduces a custom fold method for WhileSome. The optimization hinges on the underlying iterator's fold:

With Custom fold: If the underlying iterator provides its own specialized fold, the performance gain depends on its implementation. Without Custom fold: In the absence of a specialized fold for the underlying iterator, the performance remains on par with the default.

Benchmark Results:

Default fold: time: [497.09 ns 497.45 ns 497.82 ns] Custom fold: time: [456.09 ns 458.15 ns 460.80 ns]

Philippe-Cholet commented 1 year ago

Untested, I think the bench part should be

let result = data.fold(String::new(), |mut acc, ch| {
    acc.push(ch);
    acc
});
assert_eq!(result, "0123456789abcdef"); // or result.as_str() if needed
Owen-CH-Leung commented 1 year ago

Untested, I think the bench part should be

let result = data.fold(String::new(), |mut acc, ch| {
    acc.push(ch);
    acc
});
assert_eq!(result, "0123456789abcdef"); // or result.as_str() if needed

Interestingly, when I re-benchmark again using the revised benchmark (using both fold / try_fold), the performance gain disappears.

default: time: [59.414 ns 59.554 ns 59.697 ns] overriden: time: [84.891 ns 85.273 ns 85.691 ns]

Seems that the overhead from calling to_string() is far more heavier than just push(ch)

Philippe-Cholet commented 1 year ago

Each time in fold, to_string here creates a non-empty string and therefore heap-allocates (slow) while push(ch) merely mutates the existing acc string, so yes way heavier and the reason behind my quick comment, I should have said something. The assert_eq was probably a bit slow too for allocating to a vector while my suggestion would not.

Philippe-Cholet commented 12 months ago

I guess I can't merge this because an owner (jswrenn) requested changes.