mimblewimble / grin

Minimal implementation of the Mimblewimble protocol.
https://grin.mw/
Apache License 2.0
5.04k stars 990 forks source link

[DNM] Fix verifier cache keys by hashing full `Output` #3613

Closed trevyn closed 3 years ago

trevyn commented 3 years ago

Per #3606 and discussion with @antiochp, this includes:

Outstanding questions:

trevyn commented 3 years ago

Thanks @DavidBurkett ! All comments addressed.

antiochp commented 3 years ago

It would be good to add a test here that exercises the full block processing pipeline resulting in an InvalidBlockProof. Something along the lines of the test in nrd_validation_rules.rs

Ideally we can reproduce the inflation bug running the test with 5.0.1 codebase. Then confirm the inflation bug is no longer present in 5.0.4. And finally that reintroducing the verifier cache logic does not also reintroduce the inflation bug.

We will have to jump through some hoops to build an "invalid" block (presumably we need to swap out a valid rangeproof for an invalid one). This is worth spending some time on as we definitely have less test coverage (and less ability to add them) around "invalid blocks" like this.

tl;dr This was probably the most serious bug we have discovered in mainnet and its worth the additional effort and we should take the opportunity to improve our testing of the block processing pipeline, specifically around our ability to setup tests with invalid data.

trevyn commented 3 years ago

It would be good to add a test here that exercises the full block processing pipeline resulting in an InvalidBlockProof. Something along the lines of the test in nrd_validation_rules.rs

Got it. I looked the relevant code and I think I'll be able to make progress on this. Will ping when ready for a review, or with any questions that arise. Thanks!

antiochp commented 3 years ago

We should pull the test from #3621 into this PR and consolidate the work. @snape479 @trevyn

antiochp commented 3 years ago

The more I look at the "verifier cache" the less I like our current implementation. I'm leaning toward simply getting rid of it entirely. There's a lot of cloning going on internally as it was written with a poor understanding of iterators and allocations in rust (I'm allowed to saw this as I wrote the code...)

Note: the test coverage implemented by @trevyn is still very much useful and I am not proposing we discard this.

I'm going to take a pass at what the code would look like with the verifier cache removed entirely. Not saying we should definitely remove it at this point - lets see what it would involve before making that decision.

If we subsequently want to reintroduce some level of caching we should do this at the txpool level and keep it isolated there, rather than threading this global "cache" state all over the codebase.

trevyn commented 3 years ago

I definitely support removing the cache. Do we have a sense of the actual, measured performance impact?

antiochp commented 3 years ago

I definitely support removing the cache. Do we have a sense of the actual, measured performance impact?

For a handful of txs per block like we have currently and a txpool that handles those few txs the performance impact is going to be close to zero. The verifier cache was originally put in place to handle future increased tx volume.

antiochp commented 3 years ago

PR to remove the verifier cache - https://github.com/mimblewimble/grin/pull/3628

trevyn commented 3 years ago

PR to remove the verifier cache - #3628

👍🏼. Will sort out bringing the tests over once that is merged.

trevyn commented 3 years ago

Closing in favor of https://github.com/mimblewimble/grin/pull/3628.

The only new test in this PR required direct access to the verifier cache, which no longer exists. Tests from https://github.com/mimblewimble/grin/pull/3621 should still work, I'll look at porting them over.