mimblewimble / grin

Minimal implementation of the Mimblewimble protocol.
https://grin.mw/
Apache License 2.0
5.04k stars 990 forks source link

Compact Block (Partial Hydration and Recovery) #2888

Open antiochp opened 5 years ago

antiochp commented 5 years ago

Not super high priority but something we probably want to think about over the next few months.

We broadcast txs "header first", followed by "compact blocks". A compact blocks consists of a header and full info for the coinbase output and kernel, and "short_ids" for each additional tx kernel (and associated nonce for these ids).

The receiving node can use the short_ids (and nonce) to check their local txpool for the necessary txs. If the node can rehydrate the compact block based on txs in the txpool then they can process it and do not need to see the full block.

The aim of compact blocks is to minimize redundant data on the network - we do not need to receive tx data twice, once as a tx and again as part of a full block. So we can receive it once as a tx, then reconstruct the full block based on the short_ids in the compact block.

The short_ids are really short (6 bytes each) and connection (peer) specific based on a nonce included in the compact block. This allows compact blocks to be compact while minimizing id collisions (intentional or otherwise).

The problem that this issue seeks to address is simple - compact blocks are an "all or nothing" implementation right now. The receiving node can either fully hydrate a compact block or it cannot. If it cannot it falls back to requesting the full block. No attempt is made to fill in the gaps.

So if we have a situation where a blocks contains 100 tx kernels and the receiving node can hydrate 99 of those into txs from the txpool, but is still missing a single tx it fails hydration and requests the full block.


Proposed solution -

If a node receives a compact block and is missing some number of kernels and is unable to hydrate the block fully - ask the sending node for the missing txs.

I think we can potentially do this by sending back an optional vec of short_ids and nonce when requesting the block. If the original sender can determine the tx(s) based on these short_ids then it can send them over, followed by a new version of the compact block.

Sending the compact block over a 2nd time avoids the receiving node from needing to maintain any additional state.

-> compact block (short_ids and nonce)
<- request block (with the missing short_ids and nonce)
-> single aggregate tx (based on what was determined to be missing)
-> compact block (short_ids, new nonce)

These short_ids and nonce are optional. Receiving node can decide to skip them and request full block. If short_ids not included then simply send the full block. Even if short_ids are included then node is free to simply send the full block.

There are likely cases where it makes more sense to request the full block regardless - say the receiving node identifies all or the majority of the txs as missing.

Likewise for the sending node - if it is cheaper to simply send the full block (vs. retrieving many missing txs) then it can do so.

The sending node can leverage its local "reorg_cache" which contains recently processed txs when retrieving txs identified as missing by the receiving node.

Receiving node looks in its txpool to hydrate the compact block based on short_ids. Sending node looks in its local reorg_cache to identify missing txs based on short_ids.

garyyu commented 5 years ago

to minimize redundant data on the network

👍

Not directly related to this issue, but this make me think the status of redundant txs receiving for the current p2p protocol, need an investigation about that.