block history when restarting process

nionis commented 6 years ago

If we are following the chain and stop processing during a short-lived fork, the blockstream queue will be empty once we restart the process. It will then blindly trust that the next block it sees is on the main chain, because it cannot do any resolution: it doesn't know the preceding blocks it needs to calculate the longest chain.

Are there any solutions for this?
Could we have an option to specify which block blockstream should start on, so that we could simply go back 10 blocks?

MicahZoltu commented 6 years ago

@nionis By default, Blockstream retains 100 blocks of history. If you stop reconciling for fewer blocks than that and then give it the latest head block, it will fetch parents until it can re-link with the in-memory chain it has, then announce each of the blocks from oldest to newest to any listeners.

If you pause for longer than the retention limit, it will walk backwards fetching parents until it has fetched as many blocks as the retention limit, at which point it will give up, roll back all of its history (announcing removals along the way), and then start over from the head block.

I don't think I fully understand the problem you have/are trying to solve, but in case it helps, what Augur does is they sync manually (using much more efficient bulk syncing mechanisms) up to about block headNumber - 10. They then feed headNumber - 9 into blockstream, followed by the current head block. This will result in blockstream starting its sync from head - 9, and then fetching blocks head - 8 through head and announcing them all (including headNumber - 9) to listeners.
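The block numbers in the Augur approach can be worked out with a small helper. This is only a sketch of the arithmetic described above: `planAugurStyleSync`, `SyncPlan`, and the `overlap` parameter are hypothetical names and are not part of ethereumjs-blockstream or Augur.

```typescript
// Hypothetical helper: computes which blocks go to the bulk sync and
// which go to blockstream. Not part of any library.
interface SyncPlan {
  bulkSyncThrough: number;     // last block covered by the manual bulk sync
  feedToBlockstream: number[]; // block numbers handed to blockstream, in order
  announced: number[];         // block numbers listeners should end up seeing
}

function planAugurStyleSync(headNumber: number, overlap: number = 10): SyncPlan {
  if (Math.floor(headNumber) !== headNumber || Math.floor(overlap) !== overlap) {
    throw new Error("headNumber and overlap must be integers");
  }
  if (overlap < 2 || headNumber < overlap) {
    throw new Error("need overlap >= 2 and headNumber >= overlap");
  }
  // Bulk sync covers everything up to headNumber - overlap.
  const bulkSyncThrough = headNumber - overlap;
  // The first block given to blockstream is headNumber - 9 when overlap is 10.
  const firstStreamed = bulkSyncThrough + 1;
  // Blockstream is expected to backfill the blocks in between and announce
  // the whole range, oldest first.
  const announced: number[] = [];
  for (let n = firstStreamed; n <= headNumber; n++) {
    announced.push(n);
  }
  return {
    bulkSyncThrough,
    feedToBlockstream: [firstStreamed, headNumber],
    announced,
  };
}
```

For a head at block 1000 this gives a bulk sync through block 990, with blocks 991 and 1000 fed to blockstream and blocks 991 through 1000 announced.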

shrugs commented 6 years ago

Sounds like Augur's solution is what we're looking for. The main scenario we're worried about is making sure that blockstream's internal state stays consistent through restarts, so that it can pick up exactly where it left off, even if it was in the middle of a fork (because gnarly still needs to know about the fork in order to revert any changes it has made).

MicahZoltu commented 6 years ago

Due to other issues with Geth/Parity, Augur also only persists data that is synced via the initial bulk sync. Once EIP-234 is implemented in both Geth and Parity they can go back to persisting data from blockstream. This is a ham-fisted solution, but it makes things slightly better.

nionis commented 6 years ago

@MicahZoltu

If you stop reconciling blocks for less than that, then give it latest head block it will proceed to fetch parents until it can re-link with the in-memory chain it has, then it will announce each of the blocks from oldest to newest to any listeners.

If you pause for longer than the retention limit, then it will walk backwards fetching parents until it has fetched retention blocks, at which point it will give up and rollback all of history (announcing removals along the way) and then starting over from head block.

Just to make sure I understand correctly: in both of these cases, backfill is used to fetch parents until it has found a "parent" block of any of the blocks stored in memory, then it will re-link and any invalid blocks will be announced. So that means this scenario is safe:

1. we provide blocks to ethereumjs-blockstream
2. during a fork we stop providing the next latest block (we have an invalid block)
3. after a few minutes (wait so there are more blocks missing than our retention limit)
4. we start providing the latest block to ethereumjs-blockstream
5. ethereumjs-blockstream will fetch parent until it finds any of the blocks stored in memory
6. invalid block (in step 2) is detected and announced

I have been experimenting here

MicahZoltu commented 6 years ago

Correct. When a new head block is received, blockstream checks whether the current head it has matches the parent of the new block. If it does, then we have a new head and are done. If it doesn't, it looks at that block's parent to see whether it is the parent of the new block; if not, it walks back again and repeats. It does this until either it finds a parent in its history or it walks off the end of its internal history (100 blocks by default). If it finds a parent, it fires removal notifications for the logs/blocks it has on top of the parent it found, and once it has rolled back far enough it attaches the new head.

If it walks off the end of its history, then it will fetch the new block's parent and repeat the above process again. It will do this until either it finds a way to link the two chains, or it has an entire new chain that is block_retention long (100 by default). If it finds a way to link the chains, then it will roll back blocks as above and then announce the new chain (in order). If it cannot link the chains (they diverge by more than 100 blocks), then it will roll back all of the blocks it has in history (100 by default) and then announce an entirely new chain.
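The reconciliation described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the library's actual implementation: `reconcile`, `ReconcileResult`, `indexOfHash`, and `fetchByHash` are hypothetical names, and a real version would fetch parents asynchronously over RPC. It also assumes an empty history simply accepts the first block it is given.

```typescript
interface Block {
  hash: string;
  parentHash: string;
  number: number;
}

interface ReconcileResult {
  history: Block[]; // resulting in-memory chain, oldest first
  removed: Block[]; // blocks rolled back, newest first
  added: Block[];   // blocks announced, oldest first
  linked: boolean;  // false when no common ancestor was found within `retention`
}

// Returns the position of the block with the given hash, or -1.
function indexOfHash(chain: Block[], hash: string): number {
  let found = -1;
  chain.forEach((block, i) => {
    if (block.hash === hash) found = i;
  });
  return found;
}

function reconcile(
  history: Block[],
  newHead: Block,
  fetchByHash: (hash: string) => Block | undefined, // assumed stand-in for an RPC call
  retention: number = 100,
): ReconcileResult {
  // Assumption: with no history there is nothing to link to, so start here.
  if (history.length === 0) {
    return { history: [newHead], removed: [], added: [newHead], linked: true };
  }
  // A block we already hold needs no work.
  if (indexOfHash(history, newHead.hash) !== -1) {
    return { history: history.slice(), removed: [], added: [], linked: true };
  }
  // Walk back from the new head until some block's parent is in our history,
  // or until we have collected `retention` blocks of the new chain.
  const pending: Block[] = []; // newest first
  let cursor: Block | undefined = newHead;
  let linkIndex = -1;
  while (cursor !== undefined && pending.length < retention) {
    pending.push(cursor);
    linkIndex = indexOfHash(history, cursor.parentHash);
    if (linkIndex !== -1) break;
    cursor = fetchByHash(cursor.parentHash);
  }
  const added = pending.slice().reverse();
  if (linkIndex === -1) {
    // Chains could not be linked: drop everything and start a new chain.
    return { history: added, removed: history.slice().reverse(), added, linked: false };
  }
  // Roll back whatever sits on top of the common ancestor, then attach.
  const removed = history.slice(linkIndex + 1).reverse();
  const kept = history.slice(0, linkIndex + 1);
  return { history: kept.concat(added).slice(-retention), removed, added, linked: true };
}
```

The `linked: false` result corresponds to the last scenario above, where all history is rolled back and an entirely new chain is announced; a caller could treat it as an error instead.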

I believe that last scenario is not ideal, because it actually failed to reconcile the chains. I just filed an issue to make it throw an exception in that last scenario: https://github.com/ethereumjs/ethereumjs-blockstream/issues/24 rather than claiming to have reconciled.

I'm curious about your usage scenario. How are you detecting an "invalid block" before blockstream does? Also, it feels like you would be better off not pausing and just letting blockstream do its job and deal with all of the "problems".

nionis commented 6 years ago

@MicahZoltu Actually, we won't be pausing; sorry for the misunderstanding. Gnarly is an Ethereum indexer that uses ethereumjs-blockstream to make sure it keeps track of the correct blocks. Those blocks are taken in by a reducer, and from them we create a state.

The pausing is actually more like this: Gnarly crashes or restarts at a time when the chain is invalid. We need to make sure that once it boots again it can continue where it left off and the invalid block is removed. After a restart, ethereumjs-blockstream will not know about the invalid chain, and the block could already have been reorganised away. This means it will never fire a block removal to Gnarly, and we will have invalid state.

What we are thinking of doing is saving the last 10 blocks we receive to a DB. If we have a crash or restart, we can provide those blocks to ethereumjs-blockstream and afterwards provide it with the latest block. If this is done and the "gap" is below the retention limit, it should all be okay.
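A minimal sketch of that plan, assuming an in-memory array stands in for the DB. `RecentBlockStore` and `blocksToFeedOnRestart` are hypothetical names, not part of Gnarly or ethereumjs-blockstream; the idea is to call the two handlers from the block added/removed notifications, and on restart feed the returned blocks to blockstream in order.

```typescript
interface Block {
  hash: string;
  parentHash: string;
  number: number;
}

// Keeps the most recent blocks that were announced and not later removed.
// A real version would write to a database instead of an array.
class RecentBlockStore {
  private blocks: Block[] = [];
  private readonly limit: number;

  constructor(limit: number = 10) {
    this.limit = limit;
  }

  // Call when a block is announced as added.
  onBlockAdded(block: Block): void {
    this.blocks.push(block);
    if (this.blocks.length > this.limit) {
      this.blocks.shift(); // drop the oldest block
    }
  }

  // Call when a block is announced as removed (reorg).
  onBlockRemoved(block: Block): void {
    this.blocks = this.blocks.filter((b) => b.hash !== block.hash);
  }

  // Saved blocks, oldest first.
  snapshot(): Block[] {
    return this.blocks.slice();
  }
}

// Order in which to hand blocks back after a restart: the saved blocks,
// oldest first, followed by the current head of the chain.
function blocksToFeedOnRestart(saved: Block[], latest: Block): Block[] {
  const feed = saved.slice();
  const last = feed[feed.length - 1];
  if (last === undefined || last.hash !== latest.hash) {
    feed.push(latest);
  }
  return feed;
}
```

As noted above, this only helps while the gap between the last saved block and the latest block stays below the retention limit.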

So I think the solution for our use case is found

ethereumjs / ethereumjs-blockstream

block history when restarting process #22