Closed Daniel-Aaron-Bloom closed 5 months ago
Broadly speaking I'm good with this. We should do some benchmarking though
Target branch changed to allow us to benchmark this more easily
@Daniel-Aaron-Bloom we benchmarked this earlier and this PR is slower.
ref branch: 3m43.452s test branch: 4m8.345s
What a shame. Someone should probably close #1034 then, as this is just an updated version of that.
@Daniel-Aaron-Bloom Sorry for missing out the updated conversation. Outside of the changes from #1034, I noticed you parallelized a loop in this PR, which I suspect may be hindering perfs and explaining part of the overhead (IIRC the initial version of this proposal in #1034 was yielding a couple percents improvement). Is there any particular reason why this extra change was done?
I'm not sure what "extra change" you're referring to. The original change parallelizes a single loop in reduce_polys_base
and modifies the call site of that function to support the required parallel iterator input.
My change is the same in that regard. The only difference is my change removes one call to collect
that the original change tried to remove but was unsuccessful.
I'd be happy to reintroduce that collect if it's removal is causing the performance dip.
I am talking about your additional par_iter
in plonky2/src/fri/oracle.rs
which wasn't part of the initial PR.
Ah, so then we are talking about the extra collect, as the mapped Vec has to be able to be parallelizable to be passed into the reduce
. Sure, I can remove that and see if collect peforms better.
Oh, but to answer your question directly, the reason was to incorporate this feedback.
Updated and improved version of #1034
Only requires a single
collect
of the powers.