IntersectMBO / ouroboros-consensus

Implementation of a Consensus Layer for the Ouroboros family of protocols
https://ouroboros-consensus.cardano.intersectmbo.org
Apache License 2.0
31 stars 22 forks source link

Remove artificial limitation in block production #668

Open edsko opened 4 years ago

edsko commented 4 years ago

Block production was using anachronisticLedgerView to request a ledger view for the leader check, and failing with TraceNoLedgerView if that failed to produce a ledger view. However, as part of the work on input-output-hk/ouroboros-network#1933 we realized using anachronisticLedgerView here is wrong, and we should use applyChainTick instead: there is no reason why we can't produce a block even if we are far ahead. I have however left in the call to anachronisticLedgerView (soon to be ledgerViewForecastAt), because the consensus tests are currently failing without it. I strongly suspect that this is because of a limitation of the test infrastructure (not quite setting parameters strictly enough) rather than a true bug. We should try to remove it and update the tests.

Note however that once the hard fork timing stuff is merged, we will re-introduce a restriction: if our ledger is far behind, we can't even convert the current UTCTime to a SlotNo, and so we can't do the leader check. That might in fact reintroduce the same limitation, and fix this problem without the current artificial check.

nfrisby commented 4 years ago

there is no reason why we can't produce a block even if we are far ahead.

Given the typical relationships between various constants we've maintained so far, producing a block in this case would create a chain that directly violates the Chain Growth invariant. You and I have discussed this before, and you emphasized that the protocol is supposed to make such violations impossible. And since the protocol ensures the Chain Growth invariant, the node shouldn't have to.

However, when a Byron node is "catching up" (which the relevant comments in the source identify as the only scenario in which we expect this case to arise), it is beyond the scope of the analyses in the paper. Only Ouroboros Genesis is designed to support "catch up" -- everything else (Praos, (P)BFT, etc) is just supposed to "do its best" as I understand it. The Praos paper in particular assumes that any new stakeholder has, by the time it joins, somehow already selected a chain that some other pre-existing node has most recently selected. (Maybe stakepool registration is enough to ensure this?)

So I think -- until we have the Genesis rule -- the node should go willingly choose to not forge if it can clearly see doing so will create a chain that violates Chain Growth.

(I agree that the hard fork clock concerns might make this question moot, and you and I already have plans to discuss that further. I'm just recording this prompt here somewhat anachronistically.)

What do you think?

nfrisby commented 1 year ago

We looked over this in the Consensus call. I do not think we should merge this PR. The main reason:

The performance bug under recent scrutiny is that forecasting is way more expensive than it can-and-should-be. Once that's fixed, the current code's behavior will be exactly what we want.

nfrisby commented 1 year ago

I had an idea (along the lines of what Javier did in his PR) to "centralize and cache the ticking". Something like the ChainDB maintains a ledger state that has been ticked to the wallclock (usually that means: it ticks by one slot every one second). The node kernel could use that for the leadership check and the new block.

The catch is that I think the ChainDB receives the newest remote block more than a second after that block was minted. And so the "ledger state ticked to the wall clock" would often be ahead of the ledger state needed to validate the newest remote block. If instead the ChainDB tended to receive the block during the block's own slot, then both that "wallclock ledgerstate" would suffice for both the node kernel's leadership check and the ChainDB's validation of the next remote block.

Edit: so, since we don't really have an easy way to force blocks to propagate within 1s (:D), this idea is probably a dead-end.

dnadales commented 1 year ago

Would this idea correspond to input-output-hk/ouroboros-network#4054 ?

nfrisby commented 1 year ago

I would say that idea is only partly related to Issue 4054. They both incrementally maintain a ticked ledger state as time passes. However, 4054 is just treating a symptom of a performance bug. My idea above (though I think it's a dead-end; see my edit above) would be trying to combine what are currently two separate ticking computations, and doing so would happen to remove the need for forecasting (which is the same as my comment on 4054).

nfrisby commented 9 months ago

Some comments on the above discussions, FWIW.

I would suggest that that last bullet point is pre-requisite for this Issue: is there even any scenario in which the block minted when the current slot's ledger view is not forecast-able would be relevant/useful?