Closed liamsi closed 4 years ago
Current Block
and Header
:
https://github.com/LazyLedger/lazyledger-core/blob/da745371227f54aa90c609845cd4cc2f36a152f1/types/block.go#L38-L44
https://github.com/LazyLedger/lazyledger-core/blob/da745371227f54aa90c609845cd4cc2f36a152f1/types/block.go#L323-L348
We can use the existing Data
field for what is called availableData
in the current LazyLedger spec.
Current Data type: https://github.com/LazyLedger/lazyledger-core/blob/da745371227f54aa90c609845cd4cc2f36a152f1/types/block.go#L809-L819
Changes to the Block are simple. Basically, move over Evidence
field into Data
as it needs availability too. And add the availableDataHeader
:
type Block struct {
Header
AvailableDataHeader [][]byte
Data
LastCommit *Commit
}
The Header does not need to change at all 🎉 (for now at least)
There are some fields that can be removed if we switch to immediate execution.
As this proposal tries to keep the changes as small and contained as possible,
we will keep the field names as well as some fields used in the current
tendermint (light client) verification logic. With the same reasoning we keep EvidenceHash
in the Header.
I added NOTE
s where this deviates from the spec (cc @adlerjohn).
type Header struct {
Version version.Consensus
ChainID string
Height int64
Time time.Time
LastBlockID BlockID
LastCommitHash tmbytes.HexBytes
// NOTE: becomes available Data root
// (not renaming for now)
DataHash
// NOTE: keep the current hashes / pointers
// to keep the current light client logic intact
// (will address removal in separate issues):
ValidatorsHash tmbytes.HexBytes // validators for the current block
NextValidatorsHash tmbytes.HexBytes // validators for the next block
ConsensusHash tmbytes.HexBytes // consensus params for current block
// NOTE: different from the current spec!
// this is not the `stateCommitment` for the
// Tx in this block but for the Tx of the previous block:
AppHash tmbytes.HexBytes // state after txs from the previous block
// NOTE: this field is part of the deferred
// execution model
// (DeliverTx results on the previous blocks merkelized)
LastResultsHash tmbytes.HexBytes
// NOTE: this is kinda duplicate now because the `availableDataRoot`
//basically includes a commitment to the evidence data
// (will address removal in a separate issue)
EvidenceHash tmbytes.HexBytes // evidence included in the block
ProposerAddress Address // original proposer of the block
}
Data will contain everything that requires data availability. Adding fields does not cause any trouble with the current logic. The semantic of that hash changes though (currently merkle root of all fields).
Moving over evidence should be a simple change (and could be done separately later).
The newly introduced IntermediateStateRoots
can probably be computed using
the ResponseDeliverTx.Data
without changing the current logic inside tendermint at all.
Note: The only thing I'm not sure yet is how to handle the separation of
"Transactions" and "Message" in the spec's sense. ~~Maybe we can also include this
in ResponseDeliverTx.Data
: executing a Tx returns the intermediate state root
associated with that Tx as well as the corresponding message.~~
Maybe we can include this in the response of checkTx in ResponseCheckTx.Data
(DeliverTx would be one off again). Note that tendermint does not know of the distinction we make between Transaction
and Message
. The way they would be sent to a node would be in a single (network) message / (tendermint) Tx. Somewhere these need to be split up to be encoded in the block as described in the spec. At least if we don't want to add another sperate pool for messages and a separate reactor.
For the other fields in data we can hard code namespace ids.
type Data struct {
// Txs that will be applied by state @ block.Height+1.
// NOTE: not all txs here are valid. We're just agreeing on the order first.
// This means that block.AppHash does not include these txs.
Txs Txs
// Note: As Tendermint does not execute the
// Tx itself (the app does), we need a way to
// bring compute these intermediate state roots.
// Simply using the `Data` field in `ResponseDeliverTx`
// for this looks promising.
IntermediateStateRoots []tmbytes.HexBytes
// NOTE: moved over from Block
Evidence EvidenceData
// Note: Tendermint Tx are just blobs. Tendermint core itself does
// not understand how to interpret a Tx or it's validity.
// That is part of the App logic.
//
// Maybe we do not need a separate message field
// and can leverage the generality of Tx in
// tendermint instead?
Messages []Message // TODO: This needs more thought!
// Volatile
hash tmbytes.HexBytes
}
type Message struct {
// TODO: figure out constrains and
// introduce dedicated type instead of just []byte
// namespace id is necessary as this dictates where
// in the namespace merkle tree this will go.
NamespaceID []byte
Data []byte
}
This seems reasonable. The availableDataHeader
(which contains a matrix of commitments to the data in RSMT2D form) looks like it's missing from the block though.
Note: The only thing I'm not sure yet is how to handle the separation of "Transactions" and "Message" in the spec's sense.
For how they're encoded in the commitment(s), I was thinking reserve different namespace IDs for 1) transactions 2) intermediate state roots 3) evidence. Then Messages can have any other namespace ID.
For how they're handled at the networking layer, yeah you're right it looks like it'll take a bit of effort to make changes required to pass them around as a single blob then split them up. But shouldn't be unreasonably difficult (I hope!).
The availableDataHeader (which contains a matrix of commitments to the data in RSMT2D form) looks like it's missing from the block though.
Ah, thanks. Good catch!
For how they're handled at the networking layer, yeah you're right it looks like it'll take a bit of effort to make changes required to pass them around as a single blob then split them up. But shouldn't be unreasonably difficult (I hope!).
Exactly, not yet sure how to handle this part. I'm confident we can use checkTx for that. Haven't thought that through yet though.
Regarding the availableDataHeader
: does it matter to you, if the underlying data type be []byte
like in the spec or [][]byte
for the row/colums semantics? We should align the spec here too (but I guess that was on your radar anyways).
does it matter to you, if the underlying data type be
[]byte
like in the spec or[][]byte
for the row/columns semantics?
Is there any inherent downside to using a 2D matrix representation instead of a 1D array? I guess with the 2D matrix you'd need to specify two indices all the time. It would also be more work to Merkleize a matrix instead of an array (maybe not, if the iterator of the matrix is implemented nicely). Given that, I'd lean towards using a 1D array, and just doing pointer arithmetic to map {row, col} <-> {index}
. Thoughts?
It would also be more work to Merkleize a matrix instead of an array (maybe not, if the iterator of the matrix is implemented nicely)
Not really. An iterator as you mentioned is kinda equivalent to that (row, col) -> index bijection you mentioned. I'm leaning towards 2-dim (for the encoding) and defining a type AvailableDataHeader
that hides away that detail (in the implementation) and returns an iterator / ordering and methods ( RowRoot(i), ColumnRoot(j)` or whatever they need to be accessed). In the end it doesn't matter too much.
Fair enough. Will there be any serialization issue with matrices? If not then 2D is fine. Will be easier to reason about too.
Will there be any serialization issue with matrices?
Not that I'm aware of. For binary/proto seriliazation this will be:
message Block {
// [...]
repeated bytes AvailableDataHeader = 5; // or whatever field number
}
Or we can flatten the matrix for serialization (and have it only bytes) basically the way you described.
Updated above
Ugh, it looks like protobuf doesn't have native support for multidimensional arrays. So it would have to be encoded as a 1D array for serialization. It doesn't actually matter how the Go client stores this internally, so long as it serializes it properly.
It doesn't actually matter how the Go client stores this internally, so long as it serializes it properly.
True, still would like the spec and the code match as much as possible (where it makes sense).
This issue drafts changes of a first iteration of modifying the block header to support data availability proofs.
Basically be a variant of what we spec'ed out in https://github.com/LazyLedger/lazyledger-specs: with as little changes as possible to tendermint core.
As a side note, we should track and document theses decisions and changes in the repository itself. We can track them in the code directly via so called ADRs (we used these at tendermint: https://github.com/tendermint/tendermint/tree/master/docs/architecture).
Note that the spec in it's current form expects immediate execution (ref: #3) which doesn't really affect this issue besides that the mentioned intermediate state roots included in block of height h+1 reflects the state induced by transactions included in block of height h (as described in https://github.com/LazyLedger/lazyledger-core/issues/3#issuecomment-614530279).