EspressoSystems / espresso-sequencer

Implement backwards-compatible, version-aware deserialization for header #1648

Open jbearer opened 1 week ago

We did not consider versioning when we first designed the header. Thus, the Header struct from 0.1 does not start with a version number or variant identifier, and it seems impossible to deserialize into a versioned Header enum without knowing ahead of time which version we are deserializing. To get around this, we considered a workaround with a new VersionedSerialize trait, EspressoSystems/HotShot#3309. However, this has a major drawback: it does not compose with the serde Deserialize trait when Header is nested within another deserializable struct.

Now we consider a different workaround so that we can actually implement Deserialize for the versioned Header enum, thus composing nicely with serde. We take advantage of the fact that the current Header type starts with an enum, ResolvableChainConfig (which internally is just an Either<ChainConfig, Commitment<ChainConfig>>), and that with the default derivation of Deserialize for enums, adding a variant to the end is backwards compatible. Thus, we can define a new type

enum EitherOrVersion<L, R> {
  Left(L),
  Right(R),
  Version(Version),
}

Because this enum has the same variant names and types in the same positions (0 and 1) as Either, it serializes and deserializes identically for these two variants, but it will also deserialize successfully in one additional case, where it is neither L nor R, but instead contains a Version. We can thus define
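The compatibility argument can be sketched with a toy encoding. This is not serde or any real serialization format: it is a hand-rolled stand-in that writes one byte for the variant index followed by the payload, which is the property the argument relies on. The payload types (`u32`, `u8`), the two-part `Version(u16, u16)`, and the function names `encode_either` and `decode_either_or_version` are all assumptions made for illustration.

```rust
// Toy tag-based encoding: [variant index][payload bytes].
// All payload types here are placeholders, not the real ChainConfig types.

#[derive(Debug, PartialEq)]
enum Either {
    Left(u32),
    Right(u8),
}

#[derive(Debug, PartialEq)]
enum EitherOrVersion {
    Left(u32),
    Right(u8),
    // Appended last, so indices 0 and 1 keep their meaning.
    Version(u16, u16),
}

// The "old" writer only knows about the two original variants.
fn encode_either(e: &Either) -> Vec<u8> {
    match e {
        Either::Left(l) => {
            let mut out = vec![0u8];
            out.extend_from_slice(&l.to_le_bytes());
            out
        }
        Either::Right(r) => vec![1u8, *r],
    }
}

// The "new" reader accepts the old tags unchanged, plus one extra tag.
fn decode_either_or_version(bytes: &[u8]) -> Option<EitherOrVersion> {
    match bytes {
        [0, a, b, c, d, ..] => Some(EitherOrVersion::Left(u32::from_le_bytes([*a, *b, *c, *d]))),
        [1, r, ..] => Some(EitherOrVersion::Right(*r)),
        [2, a, b, c, d, ..] => Some(EitherOrVersion::Version(
            u16::from_le_bytes([*a, *b]),
            u16::from_le_bytes([*c, *d]),
        )),
        _ => None,
    }
}
```

Bytes produced by `encode_either` decode to the matching `EitherOrVersion` variant, while a leading tag of 2 selects the new `Version` case that the old reader never produced.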

struct ResolvableChainConfigOrVersion {
  chain_config: EitherOrVersion<ChainConfig, Commitment<ChainConfig>>,
}

which is backwards compatible with ResolvableChainConfig.

v0_1::Header will serialize and deserialize exactly as it is now. Future versions will deserialize as if they were a struct with just two fields:

struct VersionedHeader {
  version: ResolvableChainConfigOrVersion,
  fields: Header,
}

where Header contains the fields of interest for that version. Header can derive the normal implementations of Serialize and Deserialize, which saves some boilerplate as we will see below.

We can now implement Serialize and Deserialize for the top-level Header enum as follows:

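fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
  match self {
    // v0.1 keeps its original flat encoding.
    Self::V1(header) => header.serialize(s),
    // Later versions serialize as the two-field `VersionedHeader`, where `version` wraps
    // `EitherOrVersion::Version(..)` for the version of `fields`.
    Self::AnyOtherVersion(fields) => VersionedHeader { version, fields }.serialize(s),
  }
}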

fn deserialize<'de, D: Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
  struct Visitor;

  impl<'de> serde::de::Visitor<'de> for Visitor {
    type Value = Header;

    fn expecting(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
      f.write_str("a versioned header")
    }

    fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
    where
      A: SeqAccess<'de>,
    {
      // `next_element` yields an `Option`, so a missing first element is reported as an error.
      let chain_config_or_version: ResolvableChainConfigOrVersion = seq
        .next_element()?
        .ok_or_else(|| serde::de::Error::invalid_length(0, &self))?;
      Ok(match chain_config_or_version.chain_config {
        // For v0.1, the first field in the sequence of fields is the first field of the struct, so we call a function to get the rest of
        // the fields from the sequence and pack them into the struct.
        EitherOrVersion::Left(cfg) => Header::V1(v0_1::Header::deserialize_with_chain_config(cfg.into(), seq)?),
        EitherOrVersion::Right(commit) => Header::V1(v0_1::Header::deserialize_with_chain_config(commit.into(), seq)?),
        // For all later versions, the first "field" is not actually part of the `Header` struct (you can think of it as the first field of the
        // virtual `VersionedHeader` struct, with the `Header` being the second field). We just delegate directly to the derived
        // deserialization impl for the appropriate version.
        EitherOrVersion::Version(0.2) => Header::V2(seq.next_element()?),
        EitherOrVersion::Version(0.3) => Header::V3(seq.next_element()?),
        etc
      })
    }

    fn visit_map(...) {
      // The analogous thing, but for serialization formats where structs serialize as a map instead of a tuple (e.g. JSON).
    }
  }

  // Finally, `d` must be driven with `Visitor`; which `deserialize_*` method to use is left open in this sketch.
}

For v0_1::Header, because it serializes as a flat struct whose first field (chain_config) doubles as the version indicator, we need to write a special function for deserializing the remaining fields given the first one:

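impl v0_1::Header {
  fn deserialize_with_chain_config<'de, A>(chain_config: ResolvableChainConfig, mut seq: A) -> Result<Self, A::Error>
  where
    A: SeqAccess<'de>
  {
    // The first field has already been consumed by the caller; read the rest in declaration order.
    // (Each `next_element()` yields an `Option`; a missing element should be turned into an error.)
    Ok(Self {
      chain_config,
      height: seq.next_element()?,
      timestamp: seq.next_element()?,
      etc
    })
  }
}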

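We don't have to do this for future versions because their fields live in a separate Header struct that follows the version, so the derived Deserialize implementation can be used directly.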
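The overall dispatch can be sketched end to end with the same kind of toy byte encoding (again a hand-rolled stand-in, not serde). A v0.1 header is written flat, so its first byte is the chain-config tag (0 or 1); a later header is written as a version marker (tag 2 plus major and minor bytes) followed by its own fields. Every type, field, and function name below (`ChainConfigField`, `HeaderV1`, `HeaderV2`, `encode_header`, `decode_header`, `decode_v1_rest`) is hypothetical, and single-byte fields are used only to keep the sketch short.

```rust
// Placeholder for ResolvableChainConfig: either a full config or a commitment.
#[derive(Debug, PartialEq)]
enum ChainConfigField {
    Full(u8),
    Commitment(u8),
}

#[derive(Debug, PartialEq)]
struct HeaderV1 {
    chain_config: ChainConfigField,
    height: u8,
}

#[derive(Debug, PartialEq)]
struct HeaderV2 {
    height: u8,
    extra: u8,
}

#[derive(Debug, PartialEq)]
enum Header {
    V1(HeaderV1),
    V2(HeaderV2),
}

fn encode_header(h: &Header) -> Vec<u8> {
    match h {
        // v0.1: flat layout, the chain-config tag comes first.
        Header::V1(v1) => {
            let (tag, payload) = match &v1.chain_config {
                ChainConfigField::Full(c) => (0u8, *c),
                ChainConfigField::Commitment(c) => (1u8, *c),
            };
            vec![tag, payload, v1.height]
        }
        // Later versions: [2, major, minor] then the version's own fields.
        Header::V2(v2) => vec![2, 0, 2, v2.height, v2.extra],
    }
}

fn decode_header(bytes: &[u8]) -> Option<Header> {
    match bytes {
        // Tags 0 and 1 mean the first element already is the v0.1 chain config.
        [0, cfg, rest @ ..] => decode_v1_rest(ChainConfigField::Full(*cfg), rest),
        [1, commit, rest @ ..] => decode_v1_rest(ChainConfigField::Commitment(*commit), rest),
        // Tag 2 is the version marker; the remaining bytes are the nested header.
        [2, 0, 2, height, extra] => Some(Header::V2(HeaderV2 { height: *height, extra: *extra })),
        _ => None,
    }
}

// Counterpart of `deserialize_with_chain_config`: finish a v0.1 header
// once its first field has been consumed.
fn decode_v1_rest(chain_config: ChainConfigField, rest: &[u8]) -> Option<Header> {
    match rest {
        [height] => Some(Header::V1(HeaderV1 { chain_config, height: *height })),
        _ => None,
    }
}
```

Only the v0.1 path needs the helper that resumes decoding after the first field; the later version is decoded as a self-contained unit once the marker has been read.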