Closed edsko closed 4 years ago
Ideally we'd merge these two sets of tests, but that's harder: https://github.com/input-output-hk/ouroboros-consensus/issues/725.
@nfrisby has been busy tracing down other problems. Moving this to the next sprint.
I've familiarized myself with the relevant parts of cardano-ledger
and the corresponding PDF. The basic changes needed for Network
seem pretty straight-forward (though I suspect I might need to introduce some additional synchronization so that nodes restart "immediately" after the onset of the first slot of the new epoch).
Please confirm two observations:
1) Adopted updates only take effect at the beginning of a new epoch.
2) There are only ~two~ (edit) three parts (what else?) of the static node configuration that need to be updated when a protocol update is adopted.
* ~The slot length (which is passed to the `realBlockchainTime` call in `Node.hs`) must be updated to the new `ppSlotDuration (adoptedProtocolParamters cvsUpdateState)`.~ (Edit: `ppSlotDuration` has not and never will change via a protocol parameter update proposal; see input-output-hk/cardano-ledger#642 and input-output-hk/ouroboros-network#1725.)
* The maximum mempool size (which is chosen as 2 the max block size of the "current" ledger when the node starts) must be updated according to the new `ppMaxBlockSize (adoptedProtocolParamters cvsUpdateState)`.
* ~The `byronProtocolVersion` field of the `BlockConfig ByronBlock` record needs to be updated to the new `adoptedProtocolVersion cvsUpdateState`.~ (Edit: it seems like that's not actually necessary; see my next comment on this Issue.)
* All other protocol parameter values (from what I've seen so far) are only read from `adoptedProtocolParamters cvsUpdateState` as-needed during the various validation routines in `cardano-ledger`.
cc: @edsko @mrBliss @nc6
I have to first figure out how these features are all supposed to work before I can write code to test it. This large comment is the result. Since it is essentially documentation and code review, is there a better place for it to live? I anticipate raising some Issues as a result of this review, but haven't done so yet. I plan to at least discuss the details with @edsko first.
From skimming the Byron ledger spec pdf and some of its references and inspecting the cardano-ledger
code, I've determined the overall sequence of events involved in a successful protocol update proposal -- I'm ignoring software update proposals for now, since they seem beyond the scope of our current test suite.
A protocol update proposal transaction is created . It proposes new values for some protocol parameters and a greater protocol version number as an identifier. There cannot be two proposals with the same version number.
Genesis key delegates can add vote transactions that refer to such a proposal (by its hash). They don't have to wait; a node could add a proposal and a vote for it to its mempool simultaneously. There are only positive votes, and a proposal has a time-to-live (see ppUpdateProposalTTL
) during which to gather sufficient votes. While gathering votes, a proposal is called active.
Once the number of voters satisfies a threshold (currently determined by the srMinThd
field of the ppSoftforkRule
protocol parameter), the proposal becomes confirmed.
Once the threshold-satisfying vote becomes stable (ie its containing block is >=2k slots old), a block whose header's protocol version number (CC.Block.headerProtocolVersion
) is that of the proposal is interpreted as an endorsement of the stably-confirmed proposal by the block's issuer (specifically by the Verification Key of its delegation certificate). Endorsements -- ie any block, since they all contain that header field --- also trigger the system to discard proposals that were not confirmed within their TTL.
https://github.com/input-output-hk/cardano-ledger/blob/172b49ff1b6456851f10ae18f920fbfa733be0b0/cardano-ledger/src/Cardano/Chain/Block/Validation.hs#L439-L444
Notably, endorsements for proposals that are not yet stably-confirmed (or do not even exist) are not invalid but rather silently ignored. In other words, no validation applies to the headerProtocolVersion
field.
Once the number of endorsers satisfies a threshold (same as for voting), the confirmed proposal becomes a candidate proposal.
At the beginning of an epoch, the candidate proposal with the greatest protocol version number among those candidates whose threshold-satisfying endorsement is stable (ie the block is >=2k slots old) is adopted: the new protocol parameter values have now been changed. If there was no stably-candidated proposal, then nothing happens. Everything is retained; in particular, a candidate proposal whose threshold-satisfying endorsement was not yet stable will be adopted at the subsequent epoch unless it is surpassed in the meantime. When a candidate is adopted, all record of other proposals/votes/endorsements -- regardless of their state -- is discarded. The explanation for this is that such proposals would now be interpreted as an update to the newly adopted parameter values, whereas they were validated as an update to the previously adopted parameter values.
In summary, the following diagram tracks the progress of a proposal that's eventually adopted. For other proposals, the path short circuits to a "rejected/discarded" status at some point.
active proposal
--> (sufficient votes)
confirmed proposal
--> (2k slots later)
stably-confirmed proposal
--> (sufficient endorsements)
candidate proposal
--> (2k slots later)
stably-candidated proposal (Frisby: stably-nominated?)
--> (epoch transition)
adopted proposal
I have some concerns that maybe should become Issues.
Figure 47 "State-transition diagram for protocol-updates" in the 2020 Feb 9 Byron ledger spec pdf shows four possible states of a proposal: "Active", "Rejected", "Confirmed", and "Adopted". Its use of "Adopted" is somewhat inconsistent with the pdf's text and the cardano-ledger
code, since the diagram -- strictly read as written -- indicates the proposal can reach "Adopted" before the epoch transition. It also only mentions the "2k slots" requirement for "Adopted" but not that endorsements require the confirmation itself to be 2k slots old. In summary, I think it should be expanded so that it subsumes my ASCII-art diagram above (which is only missing the "Rejected" state).
I think the re-(ab)use of srMinThd :: LovelacePortion
as a portion of genesis keys instead of as a portion of lovelace needs more discussion/explanation/motivation/justification. The cardano-ledger
code (https://github.com/input-output-hk/cardano-ledger/blob/172b49ff1b6456851f10ae18f920fbfa733be0b0/cardano-ledger/src/Cardano/Chain/Update/ProtocolParameters.hs#L178-L189) says "In Byron we do not have [the protocol parameter we actually want], so we have to use the existing ones" and the "Update adoption threshold:upAdptThd" bullet in the pdf says essentially the same. But is this a temporary workaround? Etc. In particular, I think it should address the implication that all genesis keys will continually hold equal stake. Something along the lines of, "this is just a rough approximation for Byron, we'll do better in Shelley", I'd guess? As it is now, it feels like I'm missing some context.
I created input-output-hk/cardano-ledger#731
The above has some implications for this Issue.
The current node implementation caches the headerProtocolVersion
upon startup; it's an argument to the top-level entry point Ouroboros.Consensus.Node.run
, but I don't know which repositories' code calls into that. I'd guess it's read from a config file/CLI argument?
More to the point: the node has to restart in order to change headerProtocolVersion
. This means the node has to restart even just to begin endorsing a proposal, even though that proposal's new parameter values are not yet adopted. That seems rather severe, but it is by design.
Some protocol parameter values are also cached on start-up (ppSlotDuration
affects clock ticks, ppMaxBlockSize
affects mempool max size, and maybe more?). ~so the node will have to restart at the beginning of any epoch that actually adopts a proposal too.~
~I think having the test nodes restart at the beginning of every epoch seems reasonable, as long as I take precautions to not e.g. discard the contents of the mempool. However, to also restart in order to start endorsing seems like it's going to be quite confusing. I don't see an alternative right now, unless we some how choose to make the source of the headerProtocolVersion
mutable.~
~Moreover, ouroboros-consensus
currently neglects changes to the ppSlotDuration
parameter; see Issue input-output-hk/ouroboros-network#1725 for more. There's a bit of a chicken-and-egg problem here: the ChainDB requires a BlockChainTime
in order to open, but to have a correct BlockChainTime
if ppSlotDuration
has changed over time requires reading the latest ledger validation state with the current slot ticked, in case that's an epoch transition.~
Edit: As it turns out, since ppSlotDuration
is not actually updatable and the use of a stale ppMaxBlockSize
for the mempool size won't actually affect consensus (as long as the max block size was never essentially 0), the node does not need to restart when a protocol parameter update proposal is adopted.
Edit: The answer on Issue 1725 implies that ppSlotDuration
has not and never will change via a protocol parameter update proposal. Also, Issue input-output-hk/ouroboros-network#1498 is to make the mempool's max size change dynamically in response to ppMaxBlockSize
updates.
@dnadales , could you confirm https://github.com/input-output-hk/ouroboros-network/issues/1514#issuecomment-592277721 ?
Adopted updates only take effect at the beginning of a new epoch.
Yes.
The slot length (which is passed to the realBlockchainTime call in Node.hs) must be updated to the new ppSlotDuration (adoptedProtocolParamters cvsUpdateState). (Maybe not: see input-output-hk/cardano-ledger#642 and input-output-hk/ouroboros-network#1725)
As far as I remember we wouldn't be changing the slot duration on protocol updates. At least not as a protocol parameter.
The maximum mempool size (which is chosen as 2 the max block size of the "current" ledger when the node starts) must be updated according to the new ppMaxBlockSize (adoptedProtocolParamters cvsUpdateState).
This sounds reasonable.
All other protocol parameter values (from what I've seen so far) are only read from adoptedProtocolParamters cvsUpdateState as-needed during the various validation routines in cardano-ledger.
Yup, seems right.
CC: @nfrisby
Thanks @dnadales !
As far as I remember we wouldn't be changing the slot duration on protocol updates. At least not as a protocol parameter.
This will happen, albeit indirectly: protocol updates determine the timing of the hard fork, and the hard fork can change the slot length. This is however the only place where slot lengths can change.
Thank you both.
protocol updates determine the timing of the hard fork
@edsko How does that happen? Are you referring to a software update proposal as opposed to a protocol (parameter) update proposal?
Uh, yes, I think that might be right. @dcoutts or @JaredCorduan can confirm.
I just bors
'd PR 1747 (edit: it has now merged). It adds a somewhat trivial proof-by-example that it's possible for the ThreadNet
tests to increment the Byron protocol version, which is the necessary precondition for the Byron-to-Shelley hard-fork. Every generated RealPBFT
test case now includes a proposal and votes and endorsements that should cooperatively update the protocol version at the ~next~ second epoch.
Without retuning the other RealPBFT
generators, it seems like half of the test cases successfully do so and half don't (e.g. they don't even reach the second epoch, or the proposal times out before enough votes, etc).
The test takes the most direct path:
@edsko Is that sufficient to close this Issue? If not, can we replace it with a lower priority/urgency Issue? Thanks.
Yes, we can close this. Opened input-output-hk/ouroboros-network#1768 for testing protocol parameter changes (medium priority), and input-output-hk/ouroboros-consensus#703 for testing non-ideal voting scenarios (low priority). There is also your existing input-output-hk/ouroboros-network#1733 about testing software version updates, which we think is still high priority; see https://github.com/input-output-hk/ouroboros-network/issues/1733#issuecomment-596961953.
We suggest to look at input-output-hk/ouroboros-network#1733 and then move on to to Shelley, and in particular, start looking at input-output-hk/ouroboros-network#1770 .
Let's first get https://github.com/input-output-hk/ouroboros-network/pull/1564 merged though, so that we can close input-output-hk/ouroboros-network#1297 (I guess that's almost done anyway).
The existing PBFT tests already deal with delegation certificates. They should be extended with update proposals/votes/endorsements. I think the general setup would be similar: "posting the delegation certificate" would correspond to posting an update proposal and having nodes vote on it, and then the proposal is endorsed when "enough" nodes are restarted (which is also required for delegation certificates). We should then also verify somehow that the update proposal is in the end endorsed -- for example, change a protocol parameter and then produce a block that depends crucially on the new parameter.