ConsenSysMesh / cava

ConsenSys core libraries for Java & Kotlin
Apache License 2.0
84 stars 34 forks source link

Support "tuples" and "lists" from SSZ Spec v0.5.1 #207

Open schroedingerscode opened 5 years ago

schroedingerscode commented 5 years ago

@atoulme and/or @cleishm, if you wouldn't mind giving a little thought to the below proposal, I'd be most appreciative. I've spent a little bit mulling this over and plan to take a first stab at a PR shortly, but in the meantime an extra set of eyes doesn't hurt.

Description The SSZ spec was rewritten as part of the eth2-specs v0.5.1 release to include both fixed-length and variable-length arrays/lists. Following the python notation used in the eth2-specs, and for the sake of clarity, fixed-length arrays/lists will be referred to as "tuples" and variable-length array/lists will be referred to as "lists".

(Not so) Quick Background

The implications of the above are that "list" requires a length mixin, where as a "tuple" does not.

Proposed Changes

Additional Information See https://github.com/ethereum/eth2.0-specs/blob/v0.5.1/specs/simple-serialize.md.

cleishm commented 5 years ago

I must admit, I don't quite get this. UInt8 types are serialized as a single byte already. Large integers are serialized as little-endian byte arrays. Because SSZ contains no type information, you have to know what was serialized when you deserialize later.

A Bytes value has a variable length, so it is always serialized with a length prefix.

  • When Java bytes are serialized (i.e. when recursively serializing a Cava Bytes type), they should be serialized as a uint8.

When serializing a Bytes type, each byte of the array will appear as a single byte in the serialized output - which is the same as a uint8.

  • A Cava Bytes type of length 1 should be serialized as a uint8. [still undecided on this one]

I'd argue no on this, because it would make it impossible to use the SSZ API to create a serialization of a list of one byte.

  • A Cava Bytes type of length >1 should be treated and serialized as a "list".

This is the current behavior.

  • All fixed length Cava BytesN types (i.e. Bytes32) should be treated and serialized as a "tuple".

As above, I'd argue no on this.

  • Support will be added for serializing Bytes... (varags list) as either a "list" or a "tuple"
  • Support will be added for serializing List<? extends Bytes> as either a "list" or a "tuple".

The Cava SSZ API doesn't have methods for encoding/decoding sequences of known length (tuples), as these were added to the SSZ spec more recently. I agree these should be added. E.g. Bytes SSZ.encodeByteTyple(int length, Bytes value), Bytes SSZ.decodeByteTuple(Bytes source, int length).

schroedingerscode commented 5 years ago

Thanks for thinking through this one Chris. I made it down into the Cava SSZ weeds, and see most of what you're talking about. I'll concede that implicitly serializing things with/without the length mixin based on their type name might not be the best idea. I'll continue down the path of adding explicit API methods for fixed-length lists/tuples.