Closed ZenGround0 closed 3 years ago
Merging #1502 (6eca502) into master (a2369c5) will decrease coverage by
0.1%
. The diff coverage is29.1%
.
@@ Coverage Diff @@
## master #1502 +/- ##
========================================
- Coverage 71.5% 71.4% -0.2%
========================================
Files 72 72
Lines 8620 8634 +14
========================================
- Hits 6167 6165 -2
- Misses 1563 1572 +9
- Partials 890 897 +7
I'm a little unsure about this, only because I don't know how safe it is to absorb and ignore all of these cases.
The point of this PR is that it is safer than the alternative of aborting which will leave deadline cron in a bad state. This bad state is totally avoided at the expense of killing prove commit verifies using the approach in this PR. This is a good tradeoff. A few storage Providers burning gas in the event of an emergency is much better than a significant portion of system power being frozen out of cron.
Can we come up with a system to scream bloody murder if we do wind up erroring?
Yes all actors log rt.ERROR values should be escalated with high severity, they all represent dangerous programmer error.
While aborts from processBatchProofVerifies are all believed to be the result of programmer error or individual validator system error aborting from this code presents unnecessary risk if this is wrong. This PR changes these aborts to errors which are then logged and ignored from top level power cron.
The impact is that now in case of programmer error a batch of seal proofs will fail to validate and propagate power to storage providers instead of aborting power cron and either halting the chain or taking out 1/60th of sps from deadline cron.
In the worst case where programmer error is global all prove commits (not prove commit aggregates) would be dropped until programmer error fixed. In a more plausible state where programmer error is related to contents of the validation batch prove commits for one epoch would be dropped and then processing would continue as normal.
Reiterating that programmer error is not expected to exist but now defense against it is better.