bnb-chain / bsc

A BNB Smart Chain client based on the go-ethereum fork
GNU Lesser General Public License v3.0
2.73k stars 1.56k forks source link

Block imported below final block #2212

Closed maoueh closed 6 months ago

maoueh commented 8 months ago

System information

Geth version: geth version 1.3.8-fh2.2-f2403f09-20240131 OS & Version: Linux

Expected behaviour

Over 2 weeks, we saw a couple of occurrences where blocks were imported below the final block of the chain like if the final block was not really final or that we import known forked block(s).

This should not happen.

Actual behaviour

Here a trimmed down version of the logs highlighted to point where block get inserted below the final block:

  Jan 30 06:21:24 Imported new chain segment number=35,689,685 hash=cde0a3..abff2a final_diff=2  final_number=35,689,683 final_hash=c7eb4b..f8943b
  Jan 30 06:21:25 Imported new chain segment number=35,689,686 hash=012eb5..d23eb2 final_diff=2  final_number=35,689,684 final_hash=6954e3..a6baf0
  Jan 30 06:21:28 Imported new chain segment number=35,689,687 hash=5b6998..26e08d final_diff=2  final_number=35,689,685 final_hash=cde0a3..abff2a
  Jan 30 06:21:47 Imported new chain segment number=35,689,692 hash=9fa6ca..b4f851 final_diff=2  final_number=35,689,690 final_hash=890f57..84281c
> Jan 30 06:21:47 Imported new chain segment number=35,689,689 hash=394647..abe33f final_diff=-1 final_number=35,689,690 final_hash=890f57..84281c
  Jan 30 06:21:47 Imported new chain segment number=35,689,693 hash=0fa713..bfa0ff final_diff=2  final_number=35,689,691 final_hash=2a57e4..61ff7e
  Jan 30 06:21:49 Imported new chain segment number=35,689,694 hash=bcda09..549091 final_diff=2  final_number=35,689,692 final_hash=9fa6ca..b4f851
  Jan 30 06:21:52 Imported new chain segment number=35,689,695 hash=60c3dc..9ac044 final_diff=2  final_number=35,689,693 final_hash=0fa713..bfa0ff
  Jan 30 06:21:56 Imported new chain segment number=35,689,696 hash=5fd4b6..1b6457 final_diff=2  final_number=35,689,694 final_hash=bcda09..549091
--
  Jan 31 21:33:38 Imported new chain segment number=35,736,648 hash=840b6f..fe7fd1 final_diff=2  final_number=35,736,646 final_hash=b4c795..afa54c
  Jan 31 21:33:49 Imported new chain segment number=35,736,651 hash=c68d1a..965800 final_diff=2  ignored=1 final_number=35,736,649 final_hash=f1566a..c86cc3
  Jan 31 21:33:50 Imported new chain segment number=35,736,652 hash=7d6998..683965 final_diff=2  final_number=35,736,650 final_hash=da0227..4b8899
  Jan 31 21:34:21 Imported new chain segment number=35,736,661 hash=1127a3..369d02 final_diff=2  ignored=1 final_number=35,736,659 final_hash=401d82..db2414
> Jan 31 21:34:21 Imported new chain segment number=35,736,658 hash=c028e5..baec51 final_diff=-1 final_number=35,736,659 final_hash=401d82..db2414
  Jan 31 21:34:21 Imported new chain segment number=35,736,662 hash=fc65df..2fe86e final_diff=2  final_number=35,736,660 final_hash=74cd1b..b208f9
  Jan 31 21:34:24 Imported new chain segment number=35,736,663 hash=9e928a..ec0880 final_diff=2  final_number=35,736,661 final_hash=1127a3..369d02
  Jan 31 21:34:27 Imported new chain segment number=35,736,664 hash=8215f8..1101c8 final_diff=2  final_number=35,736,662 final_hash=fc65df..2fe86e
  Jan 31 21:34:29 Imported new chain segment number=35,736,665 hash=2d6a0c..5d70c0 final_diff=2  final_number=35,736,663 final_hash=9e928a..ec0880
--
  Feb  1 03:43:22 Imported new chain segment number=35,744,036 hash=18bbd6..d4a2ef final_diff=2  final_number=35,744,034 final_hash=197aeb..70ac62
  Feb  1 03:43:25 Imported new chain segment number=35,744,037 hash=a0e5e3..9b1717 final_diff=2  final_number=35,744,035 final_hash=53cc03..530e8f
  Feb  1 03:43:27 Imported new chain segment number=35,744,038 hash=7fae72..bf9bb7 final_diff=2  final_number=35,744,036 final_hash=18bbd6..d4a2ef
  Feb  1 03:43:43 Imported new chain segment number=35,744,043 hash=8adbb2..516428 final_diff=2  ignored=1 final_number=35,744,041 final_hash=57a1ed..15902d
> Feb  1 03:43:43 Imported new chain segment number=35,744,040 hash=676a5a..c64b91 final_diff=-1 final_number=35,744,041 final_hash=57a1ed..15902d
  Feb  1 03:43:46 Imported new chain segment number=35,744,044 hash=5e2bd4..d215d2 final_diff=2  final_number=35,744,042 final_hash=4cde83..8f7f52
  Feb  1 03:44:01 Imported new chain segment number=35,744,049 hash=4e2929..caaf85 final_diff=2  ignored=1 final_number=35,744,047 final_hash=a6d72d..d45ac2
  Feb  1 03:44:14 Imported new chain segment number=35,744,053 hash=c21d13..447765 final_diff=2  final_number=35,744,051 final_hash=eb5971..c797bf
  Feb  1 03:44:15 Imported new chain segment number=35,744,054 hash=b56ef2..854eca final_diff=2  final_number=35,744,052 final_hash=27e8fc..7f35b1
--
  Feb  1 03:47:24 Imported new chain segment number=35,744,116 hash=c4874a..36afd6 final_diff=2  final_number=35,744,114 final_hash=654895..269d57
  Feb  1 03:47:25 Imported new chain segment number=35,744,117 hash=03d077..358571 final_diff=2  final_number=35,744,115 final_hash=990d2f..891338
  Feb  1 03:47:27 Imported new chain segment number=35,744,118 hash=f0125f..29a48b final_diff=2  final_number=35,744,116 final_hash=c4874a..36afd6
  Feb  1 03:47:58 Imported new chain segment number=35,744,127 hash=33c855..e46f30 final_diff=2  final_number=35,744,125 final_hash=d20eaf..b7cd34
> Feb  1 03:47:58 Imported new chain segment number=35,744,122 hash=02e7e2..c2de61 final_diff=-3 final_number=35,744,125 final_hash=d20eaf..b7cd34
  Feb  1 03:48:23 Imported new chain segment number=35,744,136 hash=c1319a..819697 final_diff=2  ignored=1 final_number=35,744,134 final_hash=46ccd5..c7699d
  Feb  1 03:48:25 Imported new chain segment number=35,744,137 hash=53fe72..551d09 final_diff=2  final_number=35,744,135 final_hash=b09e3e..278ad7
  Feb  1 03:48:28 Imported new chain segment number=35,744,138 hash=9902e2..7edde1 final_diff=2  final_number=35,744,136 final_hash=c1319a..819697
  Feb  1 03:48:31 Imported new chain segment number=35,744,139 hash=3900ca..d4cf64 final_diff=2  final_number=35,744,137 final_hash=53fe72..551d09
--
  Feb  6 04:32:33 Imported new chain segment number=35,888,743 hash=3d27a5..d5eb0d final_diff=2  final_number=35,888,741 final_hash=c29afa..c2eb55
  Feb  6 04:32:36 Imported new chain segment number=35,888,744 hash=188afd..4c4d01 final_diff=2  final_number=35,888,742 final_hash=d3bef2..0bfaac
  Feb  6 04:32:39 Imported new chain segment number=35,888,745 hash=318417..64d6dd final_diff=2  final_number=35,888,743 final_hash=3d27a5..d5eb0d
  Feb  6 04:33:13 Imported new chain segment number=35,888,755 hash=6cc8be..0d54ca final_diff=2  final_number=35,888,753 final_hash=b6acc0..f89a49
> Feb  6 04:33:13 Imported new chain segment number=35,888,750 hash=812f1c..d8608b final_diff=-3 final_number=35,888,753 final_hash=b6acc0..f89a49
  Feb  6 04:33:14 Imported new chain segment number=35,888,756 hash=bc66bf..29c6c6 final_diff=2  final_number=35,888,754 final_hash=6691a2..cfde1e
  Feb  6 04:33:17 Imported new chain segment number=35,888,757 hash=3bc19b..599041 final_diff=2  final_number=35,888,755 final_hash=6cc8be..0d54ca
  Feb  6 04:33:19 Imported new chain segment number=35,888,758 hash=805d41..19c595 final_diff=2  final_number=35,888,756 final_hash=bc66bf..29c6c6
  Feb  6 04:33:21 Imported new chain segment number=35,888,759 hash=4dced3..d1e657 final_diff=2  final_number=35,888,757 final_hash=3bc19b..599041

[!NOTE] See full untrimmed logs

The final_diff, final_number and final_hash were added to the Imported new chain segment line with this code:

# core/blockchain.go @ L2113
stats.report(chain, it.index, trieDiffNodes, trieBufNodes, trieImmutableBufNodes, setHead, bc.CurrentFinalBlock())

[!NOTE] See https://github.com/bnb-chain/bsc/blob/v1.3.8/core/blockchain.go#L2113

And

# core/blockchain_insert.go @ L80-L82
        if setHead {
            if final != nil {
                context = append(context, []interface{}{"final_diff", end.Number()-final.Number, "final_number", final.Number, "final_hash", final.Hash()}...)
            }
            log.Info("Imported new chain segment", context...)
        } 

[!NOTE] See https://github.com/bnb-chain/bsc/blob/v1.3.8/core/blockchain_insert.go#L80-L82

Steps to reproduce the behaviour

As you can see from the logs, this is not something that is happening quite often but still produces at regular interval so far. This is a 2 two weeks sample above but it's been at least 6 weeks maybe more since we started to see this problem.

We can run modified version with extra logging if you desire and extract data for you if you want.

MatusKysel commented 8 months ago

Hey @maoueh thanks for very detailed report, I think you found a bug in our batch header verifier. I will let you know once it's fixed

buddh0 commented 8 months ago

for now, we have not used final block to limit importing block so block below latest final block can be imported, but just as sidechain we will do it as a optimization later, thank you @maoueh

I'm sure the reorg logic is not broken by these useless imported blocks, for example, we can check blocks corresponds to logs provided by you block 35689689, hash is 0x69e20876c5c7c1cf76f903a25d65b6aa3e5ab7ef2b67ab332eb2c8f081949b98, not 394647..abe33f in your log block 35736658 block 35744122 block 35888750

maoueh commented 8 months ago

Yes I don't think it's a problem for the node itself, like you mentioned, every block imported below the final block were a forked block, block which would have been discarded quickly, which I think you refer as "sidechain".

However, this is a problem for our Firehose system. Indeed, once the node determine the block was final, receiving a block before causes problem later on since we cannot link back those block below the final block (because we trim any data for which height <= final height.

We actually for now fix it in our fork by checking if the imported block is in a sidechain and kind of ignore it if not canonical.

we will do it as a optimization later

Thanks, going to monitor this thread so let me know.

buddh0 commented 8 months ago

Yes I don't think it's a problem for the node itself, like you mentioned, every block imported below the final block were a forked block, block which would have been discarded quickly, which I think you refer as "sidechain".

However, this is a problem for our Firehose system. Indeed, once the node determine the block was final, receiving a block before causes problem later on since we cannot link back those block below the final block (because we trim any data for which height <= final height.

We actually for now fix it in our fork by checking if the imported block is in a sidechain and kind of ignore it if not canonical.

we will do it as a optimization later

Thanks, going to monitor this thread so let me know.

It will be fixed in one month or three weeks

zzzckck commented 7 months ago

fixed, changed the log as it tries to import the block as sidechain.

maoueh commented 7 months ago

@zzzckck @buddh0 I read the associated PR and I don't see how it fix the root problem.

It appears the PR only removes the reporting that those block was added but they are still processed ultimately by the state_processor.go leading to block being traced while they are below the final block.

Was there other code changes prior to the PR that now prevents blocks below finalized block being imported at all?

zzzckck commented 7 months ago

@zzzckck @buddh0 I read the associated PR and I don't see how it fix the root problem.

It appears the PR only removes the reporting that those block was added but they are still processed ultimately by the state_processor.go leading to block being traced while they are below the final block.

Was there other code changes prior to the PR that now prevents blocks below finalized block being imported at all?

@maoueh The root cause is that, the previous log is a little bit misleading, it actually tries to import sidechain block, but the printed messaged did not mark the sidechain pattern. Take the bellow log for example:

  Jan 31 21:34:21 Imported new chain segment number=35,736,661 hash=1127a3..369d02 final_diff=2  ignored=1 final_number=35,736,659 final_hash=401d82..db2414
> Jan 31 21:34:21 Imported new chain segment number=35,736,658 hash=c028e5..baec51 final_diff=-1 final_number=35,736,659 final_hash=401d82..db2414
  Jan 31 21:34:21 Imported new chain segment number=35,736,662 hash=fc65df..2fe86e final_diff=2  final_number=35,736,660 final_hash=74cd1b..b208f9

block 35,736,658 is sidechain block, so we only change the misleading log, no other PRs

maoueh commented 7 months ago

It's "wrong" to import a side chain that hasn't any chance to become the canonical block(s). Indeed, if a side chain is imported for which side chain block number is < finalized block num, this is useless completely since this side chain will never make it as the canonical version of the chain (otherwise we have a bigger problem with finality not being respected).

My report was exactly as what I just described above. While I do agree that is not incorrect for the node to import those side chains, it's will become a problem for live tracer such what our Firehose system is doing.

For a bit of context, live extended tracer is something that we contributed with the help of the Geth to the geth codebase directly. It is going to be part of Geth 1.14 which at some point will make to BSC.

Now, those live tracer usually needs to deals with fork block. In our Firehose live tracer, we deal with forks and the handling is done by providing the finality signal to the live tracer (see https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L112 and https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L66).

Now if the tracer receives forked block (side chain) that are actually below the last finalized block, than this is incorrect and the chain is making something that doesn't make sense.

It could be argue that this would be something just for "live tracer" and I could deal with there. But I oppose the counter-argument that wasting time importing useless side chain is something is more a performance optimization that benefits anyone syncing.

This is something that received some attention in the past, from some previous issue we opened last year:

This worked for a while but seems some edge cases still happen today. I hope this add the necessary details so that a correct "fix" can be implemented. Happy to jump on a call if further details is needed.

maoueh commented 6 months ago

@zzzckck @buddh0 Just checking if https://github.com/bnb-chain/bsc/issues/2212#issuecomment-2046493660 has been discussed internally?

zzzckck commented 6 months ago

It's "wrong" to import a side chain that hasn't any chance to become the canonical block(s). Indeed, if a side chain is imported for which side chain block number is < finalized block num, this is useless completely since this side chain will never make it as the canonical version of the chain (otherwise we have a bigger problem with finality not being respected).

My report was exactly as what I just described above. While I do agree that is not incorrect for the node to import those side chains, it's will become a problem for live tracer such what our Firehose system is doing.

For a bit of context, live extended tracer is something that we contributed with the help of the Geth to the geth codebase directly. It is going to be part of Geth 1.14 which at some point will make to BSC.

Now, those live tracer usually needs to deals with fork block. In our Firehose live tracer, we deal with forks and the handling is done by providing the finality signal to the live tracer (see https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L112 and https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L66).

Now if the tracer receives forked block (side chain) that are actually below the last finalized block, than this is incorrect and the chain is making something that doesn't make sense.

It could be argue that this would be something just for "live tracer" and I could deal with there. But I oppose the counter-argument that wasting time importing useless side chain is something is more a performance optimization that benefits anyone syncing.

This is something that received some attention in the past, from some previous issue we opened last year:

This worked for a while but seems some edge cases still happen today. I hope this add the necessary details so that a correct "fix" can be implemented. Happy to jump on a call if further details is needed.

@maoueh I agree, you are right. We don't need to import side-chain blocks below finalized height. I will reopen this issue and we can fix it.

buddh0 commented 6 months ago

@maoueh pls try the latest develop branch

maoueh commented 6 months ago

@buddh0 @zzzckck Thanks folks!

Once it's in a release, we will monitor closely and report. It will take time as it was not happening often, but you can be confident that I'll report back at some point.