Closed vmx closed 4 years ago
It looks like some of the fancy stuff I had to do to reduce the circuit complexity of Poseidon added some runtime overhead in the form of synthesis time.
My first try at a fix yields this:
2020-02-28T21:58:36.745 21378 high INFO bellperson::groth16::prover > Bellperson 0.6.2 is being used!
2020-02-28T21:58:39.111 21378 high DEBUG bellperson::gpu::locks > Acquiring priority lock...
This is a 'gap' of ~2.4 seconds. I don't have a previous timing for my dev machine, but if we assume your numbers apply, then this cuts the regression about in half, from 4 down to 2 seconds.
I can do some more things of this nature and will iterate a bit.
And, here we go:
2020-02-29T11:31:58.708 33990 high INFO bellperson::groth16::prover > Bellperson 0.6.2 is being used!
2020-02-29T11:32:00.238 33990 high DEBUG bellperson::gpu::locks > Acquiring priority lock...
If I follow, this gets us back down to the expected gap.
For tracking purposes, this will involve:
After some more testing, it looks like the happy result above was a bit too optimistic. That set of changes also caused a regression from the original correctness work (for Poseidon) leading to the performance regression. I think I now have something which captures the consistent intersection of these new performance improvements while retaining previous Poseidon correctness fixes.
2020-02-29T15:19:13.660 42956 high INFO bellperson::groth16::prover > Bellperson 0.6.2 is being used!
2020-02-29T15:19:15.200 42956 high DEBUG bellperson::gpu::locks > Acquiring priority lock...
NOTE: I updated this because I used the wrong end event the first time.
This makes the gap 1.5s, which — given the above context — is probably a good place to proceed with those changes. If we decide further work needs to happen for this issue, it can come in a next round.
Also, bear in mind that any increase in synthesis time is traded against much fewer constraints, which should translate into faster proof-generation. It would be good to get some numbers on the total cost before/after.
If "proof-generation" is the groth_proof_time
then it went down significantly. from ~51ms to ~13ms on my machine (though we are taking about ms here).
FYI: I just reran things again with the newest versions (I updated my Cargo.lock), now the gap is even bigger on that machine, ~6.2s (it was ~4.5s).
It's true, synthesis was slow! But this was fixed in #1174, and Window PoSt is now reported to perform much faster in practice.
Description
There is a performance regression between master HEAD (https://github.com/filecoin-project/rust-fil-proofs/commit/0568667c442a6a54707f5d15ab78c690d7f91661) and https://github.com/filecoin-project/rust-fil-proofs/commit/7086c03e519d069d3926a14056c468e9f66da8b2). On master the PoSt proofs take longer.
When turning on logging, I identified a new "gap" where the regression seems to be:
older version:
master HEAD (with regression):
Between the last two log lines there is only a ~0.5s in the old version, on master it's ~4.5s.
In order to reproduce this simply run
Acceptance criteria
The gap is as small as with the old version.
Risks + pitfalls
Where to begin