paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.78k stars 638 forks source link

Sync should support fetching and importing long forks #570

Open rphmeier opened 5 years ago

rphmeier commented 5 years ago

(outdated, kept for posterity)

For GRANDPA, Casper CBC, and other chain-based consensus algorithms, we may occasionally need to download many forks which are typically small and typically not the best.

Ideally this will be via an API where it will be possible to tell the sync service which forks need downloading.


For 1.0 Gamma we added an announce_block function to NetworkService for GRANDPA nodes to announce blocks that they voted on to peers. GRANDPA voters additionally will announce periodically until the current round completes. The NetworkService announcement logic was changed to download and repropagate announcements that are for small (32 block) forks.


Further down the line we will need synchronization for arbitrarily long forks, hopefully with state pruning allowing us to keep both heads' states. Forks that connect before the finality point should be ignored.

rphmeier commented 5 years ago

cc @arkpar

arkpar commented 5 years ago

Sync currently downloads all forks that other nodes perceive as best. I've added a small test that works as expected here: paritytech/substrate#1544 Would need to examine logs to what went wrong exactly.

andresilva commented 5 years ago

Can the following happen?

I think this is why we might need to explicitly ask to sync a given chain.

rphmeier commented 5 years ago

That is what's happening, and I have a branch that reproduces: rh-test-fork-sync.

The fact that this fork might never be announced to all other peers or might be announced and then ignored is why we need the "manual" API to trigger sync to this fork. This API can be very easily invoked from substrate-finality-grandpa whenever we encounter a vote on a block we don't know.

https://github.com/paritytech/substrate/blob/f96a747776d06ed89c44fd616549aa2abfc54c68/core/network/src/test/sync.rs#L163-L189

These forks are usually small and the API might receive calls that are on the "same" fork (since all we know in the consensus layer are block hash/number -- and do its best to unify them where possible).

We also don't need to import forks from before the last finalized block, so they will usually be small.

arkpar commented 5 years ago

How would sync download a chain that was never announced? By querying random peers? This is just bad protocol design. For this to work properly all blocks must be announced.

rphmeier commented 5 years ago

It might be announced but the announcement might be missed. The author for sure has announced the block when importing but there is no guarantee that announcements propagate fully over the gossip network. You can only guarantee 1 hop -- it's possible that all immediate peers to the author have a "better" chain and thus don't re-announce.

arkpar commented 5 years ago

From what I see in the log all of the blocks are downloaded, but some fail to import with

2019-01-24 17:24:18 ImportQueue DEBUG sync  Error importing block 206989: 0xa7a0632ae042c2dbdd95bfc0185c6068c2f9de929d09af57e1421695298e16d7: Error(ClientImport("Incorrect base hash"), State { next_error: None, backtrace: InternalBacktrace { backtrace: None } })
rphmeier commented 5 years ago

In paritytech/substrate#1593 we've changed sync logic to announce and then periodically re-announce local vote blocks as well as to ensure that vote-blocks are propagating to full nodes.

rphmeier commented 5 years ago

The sync test is PR'ed here: paritytech/substrate#1604

It has an issue that paritytech/substrate#1593 doesn't address, which is that announcements of non single-length forks are ignored.

gavofyork commented 5 years ago

@rphmeier this is sorted now, right?

rphmeier commented 5 years ago

Let's keep it open and triage it to lower priority -- we have something that generally works for small forks but not perfectly for longer forks.