Open anorth opened 5 years ago
Problem: future-proofing.
Quoth @jbenet
in light of evolving protocols, security oriented protocols that serialize into non-self-describing formats take great care to ensure fields are appropriately tagged to ensure the right serialized field value is serialized/deserialized into the right in-memory field. protobuf, capnp, and more enforce this, and have for decades, for precisely protocol evolution and security. deserializing field A into field B is a class of bug trivially defeated and not worth exposing ourselves to.
- this compounds as formats change and programs (which do not all update in lockstep) continue to read old and new versions of structures.
- this is made especially worse in hash-linked data structures, which cannot be upgraded by migrating data, but instead tend to force all programs in the future to read old structure versions. field tagging is key for secure schema evolution.
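To make the "field A into field B" hazard concrete, here is a minimal sketch. It assumes two hypothetical versions of a structure (`BlockV1`, `BlockV2`, with made-up field names) and uses plain Python lists and dicts as stand-ins for CBOR arrays and maps; it is not the spec's actual encoding code.

```python
from dataclasses import dataclass, fields

# Hypothetical structure, version 1. Field names are illustrative only.
@dataclass
class BlockV1:
    miner: str
    parent: str

# Hypothetical version 2: a new field was inserted before `parent`.
@dataclass
class BlockV2:
    miner: str
    ticket: str
    parent: str

def encode_tuple(obj):
    """Positional encoding: values only, in declaration order."""
    return [getattr(obj, f.name) for f in fields(obj)]

def encode_map(obj):
    """Tagged encoding: each value travels with its field name."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}

def decode_tuple(cls, items):
    """Assign items to the reader's fields purely by position."""
    names = [f.name for f in fields(cls)]
    return dict(zip(names, items))

def decode_map(cls, items):
    """Assign items to the reader's fields by name; absent fields become None."""
    return {f.name: items.get(f.name) for f in fields(cls)}

old = BlockV1(miner="t01", parent="cid-parent")

# A V2 reader decoding V1 data by position silently puts `parent` into `ticket`.
misread = decode_tuple(BlockV2, encode_tuple(old))

# With tagged fields the same reader sees `parent` correctly and `ticket` as missing.
tagged = decode_map(BlockV2, encode_map(old))
```

In this sketch the positional reader raises no error at all, which is the point of the concern: the mistake is only detectable if the reader already knows which version it is looking at.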
Another annoyance I have just learned about is that tuple-encoding does not play nicely with graphsync. IPLD selectors operate over the encoded IPLD nodes, which in this case will be lists. So selectors for chain syncing need to be expressed with indices, rather than field names.
This is not the end of the world, of course. We can declare (or even reflect) a mapping of field name->index and use symbolic constants to construct queries.
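A rough sketch of the reflected name->index mapping, assuming a hypothetical `BlockHeader` structure and a simplified index path in place of real IPLD selector syntax (the `walk` helper is illustrative and is not a graphsync API):

```python
from dataclasses import dataclass, fields

# Hypothetical on-chain structure; field names are illustrative only.
@dataclass
class BlockHeader:
    miner: str
    parents: list
    height: int

def field_indices(cls):
    """Reflect over the declaration order to build a field name -> index map."""
    return {f.name: i for i, f in enumerate(fields(cls))}

# Symbolic constants derived once from the struct definition.
HEADER_INDEX = field_indices(BlockHeader)

def walk(node, path):
    """Follow a list of indices through a tuple-encoded (nested list) node."""
    for index in path:
        node = node[index]
    return node

# A tuple-encoded header, modelled as a plain list.
encoded = ["t01", ["cid-a", "cid-b"], 42]

# "First parent" expressed with a field name instead of a bare index.
first_parent_path = [HEADER_INDEX["parents"], 0]
first_parent = walk(encoded, first_parent_path)
```

The mapping is only as stable as the declaration order, so any reordering of fields changes every path built from it.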
cc @hannahhoward @icorderi @whyrusleeping
The spec is very light on details about the serialization/encoding of on-chain structures. At present, CBOR is not mentioned, but a few structs have a `// representation tuple` comment. I believe the intention at present is for all structures to be CBOR-tuple encoded (i.e. a CBOR array with items corresponding to struct fields in their order of declaration). This is efficient but has some potential problems. I'm filing this issue so that we have them written down somewhere.
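For a sense of the efficiency trade-off, the sketch below hand-encodes a tiny subset of CBOR (unsigned integers, text strings, arrays and maps, with lengths below 2^16) and compares the tuple and map forms of the same two values. The field names `height` and `nonce` are made up for illustration, and this is not a complete or canonical CBOR encoder.

```python
def head(major, n):
    """CBOR initial byte(s): major type in the top 3 bits, then the argument."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + n.to_bytes(2, "big")
    raise ValueError("value too large for this sketch")

def encode(value):
    """Encode a small subset of Python values as CBOR bytes."""
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return head(0, value)                      # major type 0: unsigned int
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return head(3, len(raw)) + raw             # major type 3: text string
    if isinstance(value, list):
        return head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return head(5, len(value)) + body          # major type 5: map
    raise TypeError(f"unsupported type: {type(value).__name__}")

# Hypothetical struct with fields (height, nonce) in declaration order.
tuple_bytes = encode([7, 300])                      # 5 bytes
map_bytes = encode({"height": 7, "nonce": 300})     # 18 bytes
```

The map form pays for the field names in every encoded object, which is the cost that tuple encoding avoids.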
@jbenet's most recent declaration is: