IntersectMBO / ouroboros-consensus

Implementation of a Consensus Layer for the Ouroboros family of protocols
https://ouroboros-consensus.cardano.intersectmbo.org
Apache License 2.0
29 stars 21 forks source link

The Big Decontamination Plan #386

Open edsko opened 4 years ago

edsko commented 4 years ago

This would be quite a large refactoring, but it would simplify a large part of the codebase, and means that going forward we can forget about EBBs entirely.

If we didn't care about preserving history, we could have rewritten the existing chain, rewriting nothing but the prev-hashes of every block, skipping any EBBs. We do want to preserve history, and so we cannot do this, but we can do the next best thing: we can introduce a hashmap mapping the hash of every EBB to its predecessor. This way we can do a local translation (in the HasHeader instance for ByronBlock) so that if we have blocks A - EBB - B, we rewrite the prev-hash of B on the fly to be A rather than the EBB. Now we can pretend EBBs don't exist, we don't store them, we don't send them, we have no SlotNo clashes, we have no BlockNo clashes, soooo many things in the consensus layer and storage layer could be made simpler, at the cost of one static value and HasHeader instance for a block type that is anyway no longer going to be used.

mrBliss commented 4 years ago

Possible plan:

To transition to an EBB-free chain, there will need to be a transition period in which we are both able to serve an EBB-contaminated chain and an EBB-free chain. Old Byron nodes (NodeToNodeV1/2 and NodeToClientV2) still expect EBBs and that's what they'll get. However, new Byron-to-Shelley nodes (NodeToNodeV3 and NodeToClient3) should not receive EBBs anymore. Once we have switched to Shelley, we will know that only nodes that are capable of handling an EBB-free chain are active, after which we can start making more invasive changes to get rid of EBBs once and for all. E.g., migrate the database, remove all EBB-related edge cases, etc.

This means that new Byron-to-Shelley nodes should both be capable of serving an EBB-contaminated chain (to remain compatible with old Byron nodes) and serving an EBB-free chain (to new Byron-to-Shelley nodes). The ChainDB will need to provide EBB-contaminated Iterators and Readers as well as EBB-free ones. The latter can be implemented on top of the former as it knows which blocks are EBBs (it can use the GetIsEBB block component internally).

Something else we should check: there is a staging net, which might also contain EBBs. To keep that working, we'd need to include its EBB mapping in our hard-coded one. Alternatively, they can rewrite it to be EBB-free :grin:.

If we want to use the Shelley release as the opportunity to start this process, we should do the following before the Shelley release:

Note that LedgerCursor doesn't need to change, because the in-memory representation of the LedgerDB will still have snapshots for EBBs (until we get rid of them altogether, which is after the Shelley release), so a new Byron-to-Shelley node will just never request a cursor for an EBB. In other words, we'll have more points to roll back to, which is backwards-compatible.

To test this properly, it would be nice to be able to run a mix of old and new nodes.

edsko commented 4 years ago

We'll need to remove blockPrevHash (and ideally blockInvariant) from HasHeader: rewriting of the prev hash must be conditional, because the Volatile DB needs to construct fragments of headers that optionally include EBBs (depending on the network version of the node requesting the blocks).

edsko commented 4 years ago

We also have to be careful with the other direction: nodes that do translate EBBs away but are syncing from nodes using version 1 or 2 of the protocol.

edsko commented 4 years ago

As of https://github.com/input-output-hk/ouroboros-network/pull/2335 the context for translating prev hashes is now available in principle. We have to make sure, however, that the Byron ledger itself does not check prev hashes when applying blocks. It shouldn't, consensus is responsible for header validation, but it might; if it does, we'll have to remove that check from Byron (it would have been superfluous anyway).

edsko commented 4 years ago

Since it seems we won't get a chance to push this out quickly, it makes sense to instead opt for a longer-term but more ideal plan:

  1. input-output-hk/ouroboros-network#2264: Remove the distinction between ImmDb/ImmutableDB and VolDB/VolatileDB (LgrDB and LedgerDB should probably stay: one is pure, one is not)
  2. input-output-hk/ouroboros-consensus#661 (in particular, https://github.com/input-output-hk/ouroboros-consensus/issues/661): Migrate the immutable DB secondary index, and as we do so, add

    data PrevHint = PrevIsEBB (ChainHash ByronBlock) | PrevIsRegularBlock

    into the nested context for Byron. This will then allow us to do the on-fly rewrite of the chain based on local information only, without the need for any global EBB map. This will make testing also significantly easier. This migration can be done on the fly ("JIT migration"), and does not need to read the blocks, except to establish the PrevHint of the first regular block: if there is an EBB present, we need to read it to get its prev hash (for PrevIsEBB); if there is no EBB present, it would just be PrevIsRegularBlock.

  3. We can now proceed as above, except with local information.
edsko commented 4 years ago

Possibly related/required: Add IsEBB to HeaderFields input-output-hk/ouroboros-consensus#639

mrBliss commented 4 years ago

We should implement https://github.com/input-output-hk/ouroboros-consensus/issues/645 first so that we can test that a node can both serve EBB-contaminated and EBB-free chains.

edsko commented 4 years ago

Observation from @mrBliss : how will this work with nodes syncing? Even if they don't want EBBs, they must still request them in order to be able to give them to nodes that do want them?

One thought: perhaps we should be able to get nodes in the system that only speak the new protocol, and so cannot support such "legacy" nodes..? :thinking:

edsko commented 4 years ago

Observation from @mrBliss : how will this work with nodes syncing? Even if they don't want EBBs, they must still request them in order to be able to give them to nodes that do want them? One thought: perhaps we should be able to get nodes in the system that only speak the new protocol, and so cannot support such "legacy" nodes..? thinking

Perhaps version negotiation could help here also, and it might in fact help with the phasing out. What if nodes could declare "I cannot serve EBBs", by declaring "I don't support this version number"? If you are an old style node, you need to find an upstream peer that does do it (this will require input-output-hk/ouroboros-consensus#646). This means that now we don't need this fixed support anymore within a single node: either you have EBBs, or you don't. Instead of having a single node support both EBB-contaminated and EBB-free chains, instead we can have a network of nodes, some running version 1, some version 2, so that's where we get the heterogeneity.

edsko commented 4 years ago

After EBBs are removed entirely, we can probably scrap the whole CodecConfig concept (and remove the then-redundant parameters from ProtocolClientInfo).