Closed · rbruenig closed 3 months ago
From what I understand, the compiler can better optimize heavy operations (such as `|a, b| format!("{a}{b}")`) than with `reduce`, where things are done in strict order. EDIT: Its author @scottmcm will have a better explanation than me, though.
If parallelization is the main benefit, we should probably update the documentation, as it currently states:

> otherwise if `f` is non-trivial like `format!`, you should use `tree_reduce` since it reduces the number of operations from O(n) to O(ln(n))
> How does `Itertools::tree_reduce` achieve O(ln(n)) operations?

It definitely doesn't. It's a tree depth of O(lg(n)), instead of a depth of O(n) for normal `reduce`, but you're absolutely right that it's still n - 1 calls to `f`.
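A quick way to see that the call count stays n - 1 in the tree shape is the recurrence it satisfies. This is just a counting sketch, not the itertools implementation:

```rust
// Number of applications of `f` for a tree that splits n items roughly in half:
// calls(n) = calls(floor(n/2)) + calls(ceil(n/2)) + 1, with calls(0) = calls(1) = 0.
fn tree_calls(n: u64) -> u64 {
    if n <= 1 {
        0
    } else {
        tree_calls(n / 2) + tree_calls(n - n / 2) + 1
    }
}

fn main() {
    // Same n - 1 as a plain left-to-right reduce, for every n.
    for n in 1..=1000 {
        assert_eq!(tree_calls(n), n - 1);
    }
}
```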
So yeah, sounds like it needs a documentation update.
The way you can get an algorithmic improvement is if the `f` is itself O(n). For example, if `f` is `format!("{a}{b}")`, then `reduce(f)` is O(n²) but `tree_reduce(f)` is O(n lg n).
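A back-of-the-envelope cost model makes the gap concrete. The assumption here (for illustration only, not exact allocator behavior) is that each element is one byte and concatenating strings of lengths x and y costs x + y bytes of copying:

```rust
// Left-to-right reduce: the accumulator grows 1, 2, ..., n, and each step
// copies the whole new string, so the total is 2 + 3 + ... + n ≈ n²/2.
fn reduce_cost(n: u64) -> u64 {
    (2..=n).sum()
}

// Tree shape: every level of the tree copies all n bytes once, and a balanced
// tree has about lg(n) levels, so the total is ≈ n·lg(n).
fn tree_cost(n: u64) -> u64 {
    if n <= 1 {
        0
    } else {
        let half = n / 2;
        tree_cost(half) + tree_cost(n - half) + n
    }
}

fn main() {
    assert_eq!(reduce_cost(1024), 1024 * 1025 / 2 - 1); // quadratic: 524_799
    assert_eq!(tree_cost(1024), 1024 * 10);             // n·lg(n): 10_240
}
```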
Thanks for the answer!

Could you elaborate on the example with `f` being `format!`? Isn't there a different n for `tree_reduce` and `format!`? For `tree_reduce`, n seems to be the number of elements that the iterator yields; for `format!`, n is either the number of arguments (which would be the same between `reduce` and `tree_reduce`) or the length of the given strings / the length of the result string.

The length of the arguments is different between `reduce` and `tree_reduce`: for `reduce`, one argument (the "accumulator") constantly grows while the other one is always just the next element from the iterator. For `tree_reduce`, both arguments grow. In the first level of the tree all arguments are elements from the iterator, then results from the previous level, and so on. For `format!("f({a}, {b})")` the result is different: `f(f(f(a, b), c), d)` vs `f(f(a, b), f(c, d))`. But the lengths of both results are the same, so the sum of all n given to `format!` should be the same. Both variants do the same number of allocations (which depends only on the number of calls to `f`) and allocate the same number of overall bytes.
EDIT (summarizing): If both `reduce(f)` and `tree_reduce(f)` call `f` n - 1 times, and given `f` is `format!`, how does `tree_reduce(f)` get O(n lg n) vs O(n²) for `reduce(f)`?
Because the sizes of the intermediate values are different.
Here's a demonstration: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=61f468dfab44cb92983d8c87f55702e5
```rust
let mut total_capacity = 0;
let f = |a: String, b: String| {
    let s = format!("<{a} {b}>");
    total_capacity += s.capacity();
    s
};
let r = (0..100).map(|x| x.to_string())
    .reduce(f)
    //.tree_fold1(f)
    .unwrap();
dbg!(total_capacity, r.len());

// With reduce:
// [src/main.rs:15:5] total_capacity = 47220
// [src/main.rs:15:5] r.len() = 487
// With tree_fold1:
// [src/main.rs:15:5] total_capacity = 5310
// [src/main.rs:15:5] r.len() = 487
```
The total number of calls to `f` is the same either way, as is the size of the output, but the different order for `tree_reduce` means it's concatenating smaller strings on average.
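For readers who want to reproduce this accounting without the itertools dependency, here is a self-contained variant using a hypothetical recursive helper as a stand-in for `tree_reduce` (a sketch; the real implementation works on the iterator directly rather than on a `Vec`):

```rust
use std::cell::Cell;

// Hypothetical stand-in for itertools' tree_reduce: split the items in half,
// reduce each half, then combine the two results.
fn tree_reduce<T, F: FnMut(T, T) -> T>(items: Vec<T>, f: &mut F) -> Option<T> {
    match items.len() {
        0 => None,
        1 => items.into_iter().next(),
        n => {
            let mut left = items;
            let right = left.split_off(n / 2);
            let a = tree_reduce(left, f)?;
            let b = tree_reduce(right, f)?;
            Some(f(a, b))
        }
    }
}

fn main() {
    let total = Cell::new(0usize);
    let mut f = |a: String, b: String| {
        let s = format!("<{a} {b}>");
        total.set(total.get() + s.capacity());
        s
    };

    let r = (0..100).map(|x| x.to_string()).reduce(&mut f).unwrap();
    let linear = (total.get(), r.len());

    total.set(0);
    let items: Vec<String> = (0..100).map(|x| x.to_string()).collect();
    let r = tree_reduce(items, &mut f).unwrap();
    let tree = (total.get(), r.len());

    // Same final length either way, but far less total capacity allocated
    // along the way in the tree shape.
    assert_eq!(linear.1, tree.1);
    assert!(tree.0 < linear.0);
    println!("reduce: {:?}, tree: {:?}", linear, tree);
}
```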
Thanks, that makes sense! That example would be super helpful in the documentation. If you'd like I could prepare a PR for updating the doc.
Please do! PRs always welcome 🙂
@rbruenig Do you want to make a PR?
@Philippe-Cholet I'll prepare one this week. Didn't get around to it last week.
@Philippe-Cholet @scottmcm I provided a PR to update the documentation. Please let me know if it needs to be adapted 🙂
How does `Itertools::tree_reduce` achieve O(ln(n)) operations?

If we take the visualization from the documentation:

[tree visualization from the `tree_reduce` docs]

This looks like a full binary tree with n being the number of leaf nodes and the operations `f` being the internal nodes. A binary tree with I internal nodes has L = I + 1 leaf nodes (if I'm not mistaken). That would mean the number of operations is always n - 1, which would be O(n) and the same as the normal `Iterator::reduce` operation.

What am I missing?
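The L = I + 1 step can be sanity-checked mechanically. Here is a small sketch (hypothetical `Tree` type, just for the counting argument) that builds balanced binary trees and compares leaf and internal node counts:

```rust
// A full binary tree: every node is either a leaf or has exactly two children.
enum Tree {
    Leaf,
    Node(Box<Tree>, Box<Tree>),
}

// Build a balanced tree over n leaves by splitting roughly in half.
fn build(n: usize) -> Tree {
    if n == 1 {
        Tree::Leaf
    } else {
        let half = n / 2;
        Tree::Node(Box::new(build(half)), Box::new(build(n - half)))
    }
}

// Returns (leaf count, internal node count).
fn counts(t: &Tree) -> (usize, usize) {
    match t {
        Tree::Leaf => (1, 0),
        Tree::Node(l, r) => {
            let (ll, li) = counts(l);
            let (rl, ri) = counts(r);
            (ll + rl, li + ri + 1)
        }
    }
}

fn main() {
    for n in 1..=64 {
        let (leaves, internal) = counts(&build(n));
        assert_eq!(leaves, n);
        assert_eq!(internal, n - 1); // L = I + 1, i.e. n - 1 applications of f
    }
}
```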