Open nfrisby opened 4 years ago
Summary of probable reasons: we don't run it much. More than 95% (conservatively) of the time the generators produce a test case in which the forks are "shallow", and even so the tests wouldn't necessarily fail for "deep" forks because that's just a Praos phenomenon -- testing Praos is still a WIP.
My initial thoughts as of creating this Issue:
Has FakeVRF been in use the whole time, or is that relatively recent?
RealTPraos only runs 20 tests per invocation.
d is recip <$> choose (1, 10), so only 10% of the tests have less than ~10% round-robin.
k is elements [5, 10], so whenever d's intermittent round-robin didn't curtail them, the forks were still rarely deep enough.
Praos tests do not fail if the observed leader schedule renders consensus impossible; that's currently accepted as just "bad luck". Recall that I've been working on "testing Praos" in general for a while as permitted (eg HFC et al took priority). I just haven't gotten there yet, sadly. This is an example of something that needs to improve.
Though I'm having trouble even finding a case where this "accept as bad luck" scenario plays out for RealTPraos -- it's rare with these generators.
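To make the generator arithmetic above concrete, here is a minimal sketch that enumerates the possible values of d and k instead of sampling them. It assumes `choose (1, 10)` is drawn at an integral type, so that d = 1/n for n uniform in 1..10; the actual generator in the test suite may use a different type. The names `dValues`, `kValues` and `fractionAtMost` are hypothetical and exist only for this illustration.

```haskell
module Main (main) where

import Data.Ratio ((%))

-- Assumption: `choose (1, 10)` is used at an integral type, so
-- d = recip n for n uniformly distributed over 1..10.
dValues :: [Rational]
dValues = [1 % n | n <- [1 .. 10]]

-- The two values of k named in the issue (`elements [5, 10]`).
kValues :: [Integer]
kValues = [5, 10]

-- Fraction of the equally likely d values that are at or below a threshold.
fractionAtMost :: Rational -> Rational
fractionAtMost t =
  fromIntegral (length (filter (<= t) dValues))
    % fromIntegral (length dValues)

main :: IO ()
main = do
  print (minimum dValues)          -- smallest d under this assumption: 1 % 10
  print (fractionAtMost (1 % 10))  -- share of cases with d <= 1/10: 1 % 10
  print (fractionAtMost (1 % 2))   -- share of cases with d <= 1/2: 9 % 10
  print (minimum kValues)          -- smallest k a fork has to reach: 5
```

Under that assumption only one case in ten gets round-robin as low as about 10% of slots, and every case needs a fork of depth at least 5 to matter, which fits the observation that deep forks are rare with these generators.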
master. Both fail because Test.ThreadNet.Util.Expectations assumes a k-deep fork is recoverable, but such a fork might not be recoverable because the (k+1)st ChainSync header (sometimes?) requires a newer ledger state than we have from the intersection! I haven't updated that module since this became true of ChainSync (or was it always true and we just hadn't noticed this edge case until recently?).
I ran ~3500 RealTPraos tests on master before finding those. That would require ~175 executions of the 20-tests-at-a-time RealTPraos test. How many times has CI run this (with the FakeVRF)?
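The estimate of roughly 175 executions follows from dividing the number of generated tests by the 20 tests per invocation. A small sketch of that arithmetic, where `invocationsNeeded` is a hypothetical helper and not part of the test suite:

```haskell
module Main (main) where

-- Number of test-suite invocations needed to generate at least `total`
-- test cases when each invocation runs `perRun` cases (rounds up).
invocationsNeeded :: Int -> Int -> Int
invocationsNeeded total perRun = (total + perRun - 1) `div` perRun

main :: IO ()
main = print (invocationsNeeded 3500 20)  -- prints 175
```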
I still don't know if the input-output-hk/cardano-ledger-specs#1579 PR was relevant.
FakeVRF is not recent; it's been used since before this testing work started.
See Issue input-output-hk/ouroboros-consensus#640 for context.