celestiaorg / celestia-specs

Celestia Specifications

Reconsider serializing block data into shares #152

Closed · liamsi closed this issue 2 years ago

liamsi commented 3 years ago

Description

I wanted to bring up everyone's favorite topic again. Namely serialization.

First, I want to note that the way we arrange data into shares is essentially a custom serialization format. It's a very lightweight and smart one, and it additionally uses (length-delimited) protobuf for the hard parts (which IMO is a good choice).

Despite the theoretical simplicity, we ourselves are not having the best time implementing the interesting parts (correctly), namely the splitting and merging of shares. This is not optimal because every client that does DA sampling or tries to download particular Txs directly from the p2p network will have to do the merging part, too. A custom serialization format is an additional burden and hurdle for writing clients in other languages (e.g., Rust, Python, JavaScript, etc.). IMO implementing that part should not take longer than a few hours, maybe a day, in any language. We now have proof that it takes (much) longer even for those who either invented the format or are at least deeply familiar with it. Of course, we could blame this on Go, but the reality is that every other language has its own pitfalls and tradeoffs, too.

Proposal

I'm proposing to describe shares in protobuf too. The fields of the share proto message would be the namespace, the startIndex (optional), the total length of the serialized message, and finally the part of the message contained in that particular share (in bytes) resulting from the splitting operation. That way the spec really would just need to describe the split and merge algorithms and nothing else. We'd need to be careful to do the splitting in a way that still results in equally sized protobuf messages.
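To make this concrete, here is a minimal sketch of what the proposed share message and the splitting step could look like, written as plain Go structs instead of generated protobuf code. Everything here (the field names, `split`, the `chunkSize` parameter) is illustrative and not taken from the spec:

```go
package shares

// Share is a hypothetical stand-in for the proposed proto message.
// In the actual proposal this would be a protobuf message; the field
// names are illustrative only.
type Share struct {
	NamespaceID []byte // namespace of the original message
	StartIndex  uint32 // index of this share within the message (optional in the proposal)
	TotalLen    uint32 // total length of the serialized message in bytes
	Data        []byte // chunk of the serialized message carried by this share
}

// split cuts a serialized message into chunks of at most chunkSize bytes
// and wraps each chunk in a Share. As noted above, the splitter would still
// have to ensure that the resulting proto messages come out equally sized.
func split(nid []byte, msg []byte, chunkSize int) []Share {
	var out []Share
	for i := 0; i < len(msg); i += chunkSize {
		end := i + chunkSize
		if end > len(msg) {
			end = len(msg)
		}
		out = append(out, Share{
			NamespaceID: nid,
			StartIndex:  uint32(i / chunkSize),
			TotalLen:    uint32(len(msg)),
			Data:        msg[i:end],
		})
	}
	return out
}
```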

I know, I've been kinda advocating against going all-in on protobuf for the shares myself, at least in the sense that I wanted to keep any knowledge of the serialization format out of the NMT (which IMO should simply take in raw bytes and nothing else). I'm not sure yet if the implication here would be that we'd need to move away from this property.

related: https://github.com/lazyledger/lazyledger-specs/issues/35


evan-forbes commented 3 years ago

That's an interesting idea, and it definitely needs more investigation. Provided that we could get exactly sized shares, I think it could be a good simplification measure. That said, I'm still not sure how much it would simplify things, as we're already mostly working with raw bytes and would still have to split/merge manually.

I'm not the most familiar with protobuf; how would we get the namespace before the length delimiter?

We're already using protobuf for evidence. Would we first serialize evidence, then create the shares, then serialize the shares?

adlerjohn commented 3 years ago

we could blame this on Go

Agreed 🦀

the total length of the serialized message, and finally

The problem with this suggestion is that this length is not repeated in each share. So you'd still need at least 2 serialization passes: one to length-delimit the tx/message/whatever, then another to serialize the share struct after splitting.

The issue with using protobuf for this second step is that protobuf doesn't support fixed-length arrays.
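To illustrate the two passes, here is a rough sketch (again hypothetical, reusing the illustrative `Share` and `split` from the sketch above; none of these names come from the codebase):

```go
package shares

import "encoding/binary"

// Pass 1: length-delimit the raw tx/message bytes (e.g. with a varint
// prefix), which has to happen before any splitting.
func lengthDelimit(raw []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(len(raw)))
	return append(buf[:n], raw...)
}

// Pass 2: split the length-delimited bytes and wrap each chunk in a share
// struct; in the protobuf variant each Share would then be proto-marshaled
// as a second serialization pass. Because protobuf has no fixed-length
// arrays, the schema alone cannot guarantee that every encoded share ends
// up the same size; the splitter would have to enforce that (e.g. by padding).
func toShares(nid []byte, raw []byte, chunkSize int) []Share {
	return split(nid, lengthDelimit(raw), chunkSize)
}
```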

liamsi commented 3 years ago

we could blame this on Go

Agreed 🦀

Good that you bring this up. We should likely try to implement the current scheme in Rust, as it is the most interesting language to target after Go anyway. That said, I'd be surprised if the time to finish a fully tested implementation in Rust were less than it took us to implement this in Go (maybe it will actually be faster because we've ironed out a lot of the details now, but it will be in the same ballpark).

I'm happy to write up splitting and merging using protobuf messages only, and we can try to compare the understandability and ease (or difficulty) of implementing that. But that's for later. Let's revisit this issue after the first test-nets.

adlerjohn commented 2 years ago

not planned any time soon