IntersectMBO / ouroboros-consensus

Implementation of a Consensus Layer for the Ouroboros family of protocols
https://ouroboros-consensus.cardano.intersectmbo.org
Apache License 2.0
31 stars 22 forks source link

Simplify the `protVer` arguments #324

Closed nfrisby closed 1 week ago

nfrisby commented 1 year ago

Without loss of generality, focus on Allegra. Start from the binding of protVerAllegra in protocolInfoCardano. EG at https://github.com/input-output-hk/ouroboros-consensus/blob/0079ee65f6c49d1c05a8d30ffa492b20c46b12b8/ouroboros-consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/Node.hs#L658

There, protVerAllegra:

Also, maxMajorProtVer:

mkShelleyBlockConfig ::
     ShelleyBasedEra era
  => SL.ProtVer
  -> SL.ShelleyGenesis (EraCrypto era)
  -> [SL.VKey 'SL.BlockIssuer (EraCrypto era)]
  -> BlockConfig (ShelleyBlock proto era)
mkShelleyBlockConfig protVer genesis blockIssuerVKeys = ShelleyConfig {
      shelleyProtocolVersion  = protVer
    , shelleySystemStart      = SystemStart  $ SL.sgSystemStart  genesis
    , shelleyNetworkMagic     = NetworkMagic $ SL.sgNetworkMagic genesis
    , shelleyBlockIssuerVKeys = Map.fromList
        [ (SL.hashKey k, k)
        | k <- blockIssuerVKeys
        ]
    }
data instance BlockConfig (ShelleyBlock proto era) = ShelleyConfig {
      -- | The highest protocol version this node supports. It will be stored
      -- the headers of produced blocks.
      shelleyProtocolVersion  :: !SL.ProtVer
-- | Praos parameters that are node independent
data PraosParams = PraosParams
    ...
    -- | All blocks invalid after this protocol version, see
    -- 'Globals.maxMajorPV'.
    praosMaxMajorPV        :: !MaxMajorProtVer,
data Globals = Globals
  ...
  , maxMajorPV :: !Version
  -- ^ All blocks invalid after this protocol version

https://github.com/input-output-hk/ouroboros-consensus/blob/0079ee65f6c49d1c05a8d30ffa492b20c46b12b8/ouroboros-consensus-cardano/src/ouroboros-consensus-cardano/Ouroboros/Consensus/Cardano/Node.hs#L754-L775

Note that this use is exactly contrary to the warning comments where allegraProtVer is actually set, such as:

https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L185-L194

        Consensus.ProtocolParamsAllegra {
          -- This is /not/ the Allegra protocol version. It is the protocol
          -- version that this node will declare that it understands, when it
          -- is in the Allegra era. That is, it is the version of protocol
          -- /after/ Allegra, i.e. Mary.
          allegraProtVer =
            ProtVer (natVersion @4) 0,
          allegraMaxTxCapacityOverrides =
            TxLimits.mkOverrides TxLimits.noOverridesMeasure
        }

It is currently my suspicion that all of these per-era *ProtVer arguments should be removed (thereby eliminating (B)), and replaced by some logic that defines maxMajorProtVer by analyzing the last given transitionIntraShelleyTrigger. And the per-era values used in (A) should instead all simply use maxMajorProtVer.

amesgen commented 1 year ago

Minor comments:

Note that this use is exactly contrary to the warning comments where allegraProtVer is actually set, such as:

https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L185-L194

Yeah, I think this is a case where we actually loose generality by only considering Allegra: Ignoring experimental eras for now (this just changes what the "last era" is), protVerXXX has the following meaning ATM:

Also see https://github.com/input-output-hk/ouroboros-consensus/pull/294#discussion_r1304111224

So in fact, we can safely use the protVer entry for the last era as the max major protocol version, despite the fact that protVerXXX is not a protocol version of the XXX era for all but the last era.

It is currently my suspicion that all of these per-era *ProtVer arguments should be removed (thereby eliminating (B)), and replaced by some logic that defines maxMajorProtVer by analyzing the last given transitionIntraShelleyTrigger.

That specific approach won't work directly:

The general approach can of course still work, we just have to pass in more stuff ("worst-case", just the max major protocol version directly).

And the per-era values used in (A) should instead all simply use maxMajorProtVer.

Intuitively, that sounds like a good plan; main question is: do we ever want forging nodes to put different versions in their header for different eras? Maybe a statistical use case similar to https://github.com/input-output-hk/cardano-node/blob/70dde196367e86c977fbe72167f4eb303b3ffa88/cardano-node/src/Cardano/Node/Protocol/Cardano.hs#L210-L213 could arise?

nfrisby commented 1 year ago

protVerXXX has the following meaning ATM

The changing semantics of the field---especially in a way dependent on the --experimental flag---is confusing for a "configuration data" data structure😬.

That specific approach won't work directly.

Yeah, I figured there isn't quite enough information/it wasn't accessible right now. But this still seems right to me as a high-level design target.

do we ever want forging nodes to put different versions in their header for different eras?

In general, sounds useful. And indeed you linked to a former example use (which correctly comments itself as a "HACK" :/).

But it also sounds not strictly necessary. So at the very least, I think it should be provided in a separate "overrides" argument, eg (perhaps limited to the minor version?). More directly: I think we should remove it now and add something like that back when it's next actually needed.

nfrisby commented 1 year ago

PR #294 is definitely related. For context: I hadn't looked over that PR before drafting this Issue.

nfrisby commented 1 year ago

PR https://github.com/input-output-hk/ouroboros-network/pull/3891 also seems related

Also, I transferred Issue #325 from ouroboros-network repo.

Wise final words from Jared in his comment:

but it will take some thought, and updating the specs

nfrisby commented 1 year ago

The MaxMajorPV was added in this commit https://github.com/input-output-hk/cardano-ledger/commit/441e23bc7541ab0a535e9d24cda1d836c04b64ed

Edit: Just for context: that commit is from March 2020, which is before the Byron-Shelley hardfork.

nfrisby commented 1 year ago

Esgen and I just discussed this all during a call. There are several overlapping concerns, sadly.

Testing overrides

For testing, we let config override the version-based triggers in the HFC.

Experimental flags

There are two "experimental" flags in the node: ExperimentalHardForksEnabled (<-> npcExperimentalHardForksEnabled) (formerly TestEnableDevelopmentHardForkEras) and ExperimentalProtocolsEnabled (<-> pncExperimentalProtocolsEnabled) (formerly TestEnableDevelopmentNetworkProtocols).

The node ultimately passes ExperimentalProtocolsEnabled to Consensus as srnEnableInDevelopmentVersions.

  , srnEnableInDevelopmentVersions  :: Bool
    -- ^ If @False@, then the node will limit the negotiated NTN and NTC
    -- versions to the latest " official " release (as chosen by Network and
    -- Consensus Team, with input from Node Team)

The node uses ExperimentalHardForksEnabled to alter the upper bounds on the *ProtVer arguments it passes to Consensus. Re-read the Issue description to understand the consequences this flag therefore has. For example:

          Praos.babbageProtVer =
            if npcExperimentalHardForksEnabled
              then ProtVer (natVersion @9) 0
              else ProtVer (natVersion @8) 0,

It is not obvious to me how these two flags interact in general. In particular, suppose we have an experimental era, such as Conway right now. Is it necessarily the case that ExperimentalProtocolsEnabled enables (at least) the NTN and NTC versions that Conway requires? And so we should require that ExperimentalProtocolsEnabled implies ExperimentalHardForksEnabled?

When we instead don't have an experimental era, ExperimentalHardForksEnabled has no effect and so that implication wouldn't be necessary? For example, without an experimental era, ExperimentalProtocolsEnabled could be guarding merely new queries.

The Obsolete Node Checks

The MaxMajorPV mechanism introduced here https://github.com/input-output-hk/cardano-ledger/commit/441e23bc7541ab0a535e9d24cda1d836c04b64ed is older than the HFC. At a high-level, the HFC renders MaxMajorPV unnecessary: if the node doesn't yet know about the Nth ledger era, then it won't be able to even parse a header in that era. However, because of "intra-era hard forks", it may be that MaxMajorPV is necessary even in the presence of the HFC.

It seems possible that we could ban intra-era hard forks in the future. Or maybe even implement them as duplicates in the HFC type-level list (but that would be inconvenient in terms of performance, and Issue https://github.com/input-output-hk/ouroboros-consensus/issues/328, eg). Either would let us remove the MaxMajorPV mechanism, and instead rely only on the HFC for this.

Type-level ledger era major version intervals

The ledger statically assigns intervals of the major version to each era. For example:

https://github.com/input-output-hk/cardano-ledger/blob/a90833107aa98f359dbb839991582b397c1c4bc8/eras/allegra/impl/src/Cardano/Ledger/Allegra/Era.hs#L20-L26

-- | The Allegra era
data AllegraEra c

instance Crypto c => Era (AllegraEra c) where
  type PreviousEra (AllegraEra c) = ShelleyEra c
  type EraCrypto (AllegraEra c) = c
  type ProtVerLow (AllegraEra c) = 3

Alonzo is different because it contains a so-called "intra-era hard fork" (ie a hard fork that does not change the ledger era, and so does not involve the HFC).

https://github.com/input-output-hk/cardano-ledger/blob/a90833107aa98f359dbb839991582b397c1c4bc8/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Era.hs#L21-L28

-- | The Alonzo era
data AlonzoEra c

instance Crypto c => Era (AlonzoEra c) where
  type EraCrypto (AlonzoEra c) = c
  type PreviousEra (AlonzoEra c) = MaryEra c
  type ProtVerLow (AlonzoEra c) = 5
  type ProtVerHigh (AlonzoEra c) = 6

If it weren't for the testing overrides and the experimental flags, I think all of the node's use of protocol versions would be determined by those ProtVerLow and ProtVerHigh instances---even intra-era hardforks, since ProvVerHigh captures those. Sounds appealing.

coot commented 1 year ago

My mental model, although I know very little about the ExperimentalHardForksEnabled (so I also feel the need to sync on that), is that consensus and network flags are independent in general. However there might be hard forks which require to set both flags, or just one of them. You're right that TestEnableDevelopmentNetworkProtocols can limit queries, which is relatively small misconfiguration (doesn't break node-to-node, but might prevent testing some its features), but in the future we might have new consensus algorithm which will require an entirely new network (mini-)protocol (e.g. ouroboros-peras or ouroboros-laios). There's yet another level of complication: it could happen that we will have an experimental network protocol change which we don't want to enable, while consensus also requires another experimental feature from the network. Do we need to worry about this before it happens?

nfrisby commented 1 year ago

Do we need to worry about this before it happens?

I think the only risk of not considering this would be surprise delays in releases (which are bad, but not as bad as bugs eg).

coot commented 1 year ago

I think so too.

Right now Handshake assumes that all versions are linearly (/ totally) ordered. But we could relax it to be a partial order, which would allow things like:

graph TD;
  1'-->0;
  1-->0;
  2-->1;
  2-->1';  

where 1 and 1' are the non comparable two different experimental version. Consensus could peak it's own max network version, the same with network. Assume that 1 contains some consensus feature while 1' some network features, and we don't want to publish both (e.g. 2). Then the network team would pick 1 as it's max version, and consensus would peak 1' as its max version.

Our picking algorithm would need to pick the smallest sub-partial order which is closed under supremum. This way if one enables only consensus experimental features one would end with versions smaller or equal to 1; if one only enables network experimental features one would end with versions smaller or equal to 1' and if one picks both then all versions smaller or equal 2.

Handshake would do the same as now: pick the largest version which belongs to both own supported versions and the presented one.