Consider possible improvements to the SSZ spec before phase0 is launched

zah commented 4 years ago

One of the design goals of SSZ is that it should make it easier for other blockchains to work with merkle proofs referencing Eth2 consensus objects. Once phase0 is launched, we can expect various official SSZ records to start appearing in third party databases. This would significantly increase the difficulty of coordinating upgrades to the SSZ spec (due to the limited forward compatibility provisions in SSZ, a lot of applications may get broken in the process). Due to this, I think we should consider introducing some final refinements and optimisations to the SSZ spec before phase0 is launched:

1) Reduce the size of every variable-size container by 4 bytes.

Every variable-size container (i.e. record with fields) consists of a fixed-size section storing the offsets of the variable-size fields.

The offset of the first such field currently has only one valid value - it must be equal to the length of the fixed-size section. The implementations are expected to check for this, because otherwise there might be some unused bytes in the SSZ representation which is considered an invalid encoding.

The motivation for not allowing unused bytes is that this would break the property deserialize(serialize(x)) == x which is quite useful for fuzzing. For completeness, I would mention that if unused bytes were allowed, a very limited form of forward-compatibility will be present - it would be possible to add a new field at the end of a record without breaking older readers. Since SSZ upgrades require coordination and all long-term storage applications should also feature an out-of-band version tag, this limited form of forward-compatibility was considered unnecessary.

In other words, since the first offset has only one valid value that is completely derived from the type schema, the offset carries no information and can be omitted from the representation. The result will be that every variable-size container will be 4 bytes shorter. Admittedly, 4 bytes are not much, but if we consider the long expected life of the SSZ spec and great multitude of places where SSZ records might appear, some quick back-of-the-envelope calculation estimated the total cost savings in bandwidth and storage to amount to roughly 1 gazillion bytes :P

2) Null-value optimisation (a.k.a better support for pointer types and `Option[T]`)

The SSZ spec defines union types that can discriminate between null and a possible value. Let's call such types Nullable. Since the Nullable types have variable size, their length in bytes can be zero (just like how we encode zero-length lists with two consecutive offsets with the same value). I propose the addition of the following two special rules:

The null value of a Nullable union is encoded as zero bytes.
A union with just one non-null branch is encoded without a serialized_type_index.

Please note that in most programming languages, the unions described above can be mapped to frequently used types such as Option[T] or a pointer type. During the development of the blocks_by_range protocol, an earlier version was suggesting that missing blocks should be indicated in the response as a default(T) encoding of the BeaconBlock type. This was semantically equivalent to using an Option[T] type, but it would have been considerably more inefficient. The design of the protocol was refined in later versions to not require this form of response, but I think that if one of the very first protocols was that close to using and benefiting from the Option[T] type, we can expect more protocols to appear in the future that will benefit as well.

3) Resolve a contradiction in the SSZ List limit type

The SSZ spec doesn't specify what is the type of the list size limit. This leads to something that can be described as a slight contradiction in the current specs:

The size limit of the validator registry is set to 1099511627776 (2^40). On the other hand, the maximum size in practice is limited in the encoding to the difference of two offset values. Since the offset values are encoded as uint32, the maximum size in practice cannot be larger than 2^32. Perhaps the intention for the size limit is that it should only affect the merkle hash computation, but the spec would do nice to clarify this.

protolambda commented 4 years ago

Hey, thanks for opening this issue, this is very constructive :+1:

Some of these we have already discussed to some extent before, but I will share my thoughts in depth here so others can review.

Reduce the size of every variable-size container by 4 bytes.

This is more of debate to choose between consistency and optimization, there are definitely arguments for both. I can think of:

Having the 4 bytes:
- An SSZ routine in a smartcontract or other place can be more general, and thus smaller.
- One less edge-case to worry about (general point again, but for security)
- the fixed-length part of the type information becomes a harder requirement to read the affected dynamic length field.
Not having 4 bytes:
- save 4 bytes in each container that contains dynamic-length field
- Not having to check the first offset, because it's not there.

And the most common examples, Attestation [229, 485) and BeaconBlock [304, 124420) (sizes can be found and computed here) are big enough IMHO that a few of these occurrences do not make too much of a difference, so I am hesitant to optimize it.

And yes, I hope you are right this is a gazillion byte difference, but that's more of a thing to do with widespread usage. The variance in snappy ompression of different values will likely already be higher.

unused bytes is that this would break the property deserialize(serialize(x)) == x

IMHO it doesn't break the property, serialize(x) is expected to be valid, and deserialize validation checks handle it fine. No information is lost in the process. But instead, a few unused bytes are present. These validation checks are put in place for the first offset (based on compile-time fixed-length-part size information), and this is very similar to the first-offset check of vectors and lists of dynamic fields. There we check if the first offset matches the fixed-length part (vectors) or determine the length from it and check the limit (lists). So I prefer to keep the symmetry between types there.

And bitfields, as well as boolean, have validation constraints like this on a bit-level; the (last) byte can only have so many bits set. Not all arbitrary byte strings can be valid SSZ because of that.

TLDR: I prefer consistency for now, unless others make a compelling new argument why we need this.

Also, thanks for tracking and documenting this optimization idea :+1:

Null-value optimisation

Regardless of the whole null debate, I think it's still a good idea to support explicitly nullable types.

And while we are it it with modifications related to null, if we like to have null outside of union as well, we may want to better define its type. null is 0 bytes, so this also means it cannot be fixed-length. Or we have to put element-type restriction on containers like Container and Vector, to avoid them only having this as element type, and causing divide-by-zero issues. We can declare it as a dynamic-length type and we should be fine. An Union is dynamic length anyway, so it doesn't matter there.

And once we have a null type for serialization, we should also look at hash-tree-root. Declaring it as an always fully zero Root could work, and we could use it for padding purposes in applications, where we deprecate a field, but want to keep the same merkle structure. Does that make sense?

A union with just one non-null branch is encoded without a serialized_type_index

I think this condition is sufficient, null as value in the union should already be 0 bytes.

This definitely deserves a draft / proposal somewhere. Maybe in the new SSZ repo, but I wouldn't mind a PR here, as the new SSZ repo has received mixed support signals.

And right now null and Union are not used anywhere, and client implementations either experimented with it, or completely ignore it. So I think there's room to make bigger changes here if necessary.

Resolve a contradiction in the SSZ List limit type

The SSZ spec doesn't specify what is the type of the list size limit

The idea is that like in Go and some other languages, we can defer saying anything about it until it's used anywhere. But agree it's not very clean, and offsets are a problem, while merkleization fits fine (256 bit integer length mix-ins, although not fully used in practice).

validator registry is set to 1099511627776 (2^40) On the other hand, the maximum size in practice is limited in the encoding to the difference of two offset values.

You are right, and that should be fixed. With the current validator size the practical limit would be: (2**32)/121 = approx 35 million validators, plenty for now, but yes, the 40 number was a guess to accommodate withdrawn validator slots; once unused, the slot will be forever empty (and should be pruned from the state in the client view of the state as a tree, which seems like the go-to approach once we reach enough validators to go through that kind of churn at least). With the amount of ETH, deposit time, an activation queue, etc., I don't think we'll hit this anytime soon.

it should only affect the merkle hash computation

Yes, one nice side-effect here is that the tree is big enough to later fit other types on it, without breaking the existing merkleization. It's still very far away, but it offers some flexibility. We can ignore it for now though, and should focus on the offsets part of the problem.

One theoretical option for offsets is that we group elements to 2**32 / fixed_element_size long runs, and have a second layer of offsets, etc. This starts to become unrealistic very quickly though, as we don't actually expect to fill available system memory with a single big SSZ element. Instead, we should just plan to transition to a more tree-like type for these large lists, when we need them. When the Validator type changes in phase 1, and we have to change the registry anyway, that may be a good timing.

Meanwhile, the sparse-merkle-tree issue is still open, and collecting dust. If we had a place for a more concrete draft, we may make more progress there. For phase 0 it is no hard requirement though.

zah commented 4 years ago

An SSZ routine in a smartcontract or other place can be more general, and thus smaller.

During the development of the SSZ implementation in Nim, I've paid great attention to the size of the generated code. Since Nim has the ability to compute a lot of metadata about the serialized type at compile time, it does allow me to produce very short and specialized code for selective traversal (reaching a particular field of interest) and full de-serialization of specific types. I believe our implementation will be quite popular with smart contract authors trying to produce the shortest possible WebAssembly code for extracting data from SSZ records or for verifying merkle proofs.

With the 4-byte reduction in place, the code that our implementation can generate doesn't become longer, but rather it becomes shorter. The reason for this is that the optimisation only modifies some offset positions that are computed at compile-time and it does remove an unnecessary verification step which results in slight code size reduction.

protolambda commented 4 years ago

Awesome, that sounds very promising for smart-contracts. I still wonder about the general case without compile time optimization here though. Where the contract uses an ssz library or some other type of abstraction, to read data based on more dynamic traversal.

And I would like containers and vectors to be consistent: if we're removing it from a container, then Container {a: List[uint64, 2], b: List[uint64, 2], c: List[uint64, 2], d: List[uint64, 2]} becomes different than Vector[List[uint64, 2], 4] in serialization. So we can't have effortless named tuples anymore. And removing the first offset of a dynamic-length element vector creates a special case in the lookup pointer math: if it's index 0, then don't read or add the offset to the current pointer. It's pretty small things like this though, maybe removing the offset is worth it, I'm undecided now.

For now let's minimize changes that affect testnets, but maybe later we can introduce this optimization if it gets welcomed by other implementers.

zah commented 4 years ago

Well, Vector[List[uint64, 2], 4] needs just 3 offsets as well, because the first element will always start at position 12 (3 offsetSize). So, a vector* can be consistent with a container.

Removing the first offset from a variable-size list is not possible in the same way, because the offset carries information (it determines the length of the list).

protolambda commented 4 years ago

So, a vector can be consistent with a container. Removing the first offset from a variable-size list is not possible in the same way

Yes I understand. What I am thinking of is that for vectors, having the first (strictly speaking unnecessary) offset makes sense for other purposes. Avoiding an edge-case before the pointer math involved in the lookup, and consistency with lists.

We could change vector to stay consistent with containers, but I'm unsure about that trade-off. So I'm undecided, but looking for stability. Let's please avoid affecting testnets for now, and ask others for feedback in the meantime. The other two points are better targets for improvements right now :+1:

ethereum / consensus-specs