Closed maoueh closed 6 months ago
Hey @maoueh thanks for very detailed report, I think you found a bug in our batch header verifier. I will let you know once it's fixed
for now, we have not used final block to limit importing block so block below latest final block can be imported, but just as sidechain we will do it as a optimization later, thank you @maoueh
I'm sure the reorg logic is not broken by these useless imported blocks, for example, we can check blocks corresponds to logs provided by you block 35689689, hash is 0x69e20876c5c7c1cf76f903a25d65b6aa3e5ab7ef2b67ab332eb2c8f081949b98, not 394647..abe33f in your log block 35736658 block 35744122 block 35888750
Yes I don't think it's a problem for the node itself, like you mentioned, every block imported below the final block were a forked block, block which would have been discarded quickly, which I think you refer as "sidechain".
However, this is a problem for our Firehose system. Indeed, once the node determine the block was final, receiving a block before causes problem later on since we cannot link back those block below the final block (because we trim any data for which height <= final height
.
We actually for now fix it in our fork by checking if the imported block is in a sidechain and kind of ignore it if not canonical.
we will do it as a optimization later
Thanks, going to monitor this thread so let me know.
Yes I don't think it's a problem for the node itself, like you mentioned, every block imported below the final block were a forked block, block which would have been discarded quickly, which I think you refer as "sidechain".
However, this is a problem for our Firehose system. Indeed, once the node determine the block was final, receiving a block before causes problem later on since we cannot link back those block below the final block (because we trim any data for which
height <= final height
.We actually for now fix it in our fork by checking if the imported block is in a sidechain and kind of ignore it if not canonical.
we will do it as a optimization later
Thanks, going to monitor this thread so let me know.
It will be fixed in one month or three weeks
fixed, changed the log as it tries to import the block as sidechain.
@zzzckck @buddh0 I read the associated PR and I don't see how it fix the root problem.
It appears the PR only removes the reporting that those block was added but they are still processed ultimately by the state_processor.go
leading to block being traced while they are below the final block.
Was there other code changes prior to the PR that now prevents blocks below finalized block being imported at all?
@zzzckck @buddh0 I read the associated PR and I don't see how it fix the root problem.
It appears the PR only removes the reporting that those block was added but they are still processed ultimately by the
state_processor.go
leading to block being traced while they are below the final block.Was there other code changes prior to the PR that now prevents blocks below finalized block being imported at all?
@maoueh The root cause is that, the previous log is a little bit misleading, it actually tries to import sidechain block, but the printed messaged did not mark the sidechain pattern. Take the bellow log for example:
Jan 31 21:34:21 Imported new chain segment number=35,736,661 hash=1127a3..369d02 final_diff=2 ignored=1 final_number=35,736,659 final_hash=401d82..db2414
> Jan 31 21:34:21 Imported new chain segment number=35,736,658 hash=c028e5..baec51 final_diff=-1 final_number=35,736,659 final_hash=401d82..db2414
Jan 31 21:34:21 Imported new chain segment number=35,736,662 hash=fc65df..2fe86e final_diff=2 final_number=35,736,660 final_hash=74cd1b..b208f9
block 35,736,658 is sidechain block, so we only change the misleading log, no other PRs
It's "wrong" to import a side chain that hasn't any chance to become the canonical block(s). Indeed, if a side chain is imported for which side chain block number is < finalized block num
, this is useless completely since this side chain will never make it as the canonical version of the chain (otherwise we have a bigger problem with finality not being respected).
My report was exactly as what I just described above. While I do agree that is not incorrect for the node to import those side chains, it's will become a problem for live tracer such what our Firehose system is doing.
For a bit of context, live extended tracer is something that we contributed with the help of the Geth to the geth
codebase directly. It is going to be part of Geth 1.14 which at some point will make to BSC.
Now, those live tracer usually needs to deals with fork block. In our Firehose live tracer, we deal with forks and the handling is done by providing the finality signal to the live tracer (see https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L112 and https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L66).
Now if the tracer receives forked block (side chain) that are actually below the last finalized block, than this is incorrect and the chain is making something that doesn't make sense.
It could be argue that this would be something just for "live tracer" and I could deal with there. But I oppose the counter-argument that wasting time importing useless side chain is something is more a performance optimization that benefits anyone syncing.
This is something that received some attention in the past, from some previous issue we opened last year:
This worked for a while but seems some edge cases still happen today. I hope this add the necessary details so that a correct "fix" can be implemented. Happy to jump on a call if further details is needed.
@zzzckck @buddh0 Just checking if https://github.com/bnb-chain/bsc/issues/2212#issuecomment-2046493660 has been discussed internally?
It's "wrong" to import a side chain that hasn't any chance to become the canonical block(s). Indeed, if a side chain is imported for which
side chain block number is < finalized block num
, this is useless completely since this side chain will never make it as the canonical version of the chain (otherwise we have a bigger problem with finality not being respected).My report was exactly as what I just described above. While I do agree that is not incorrect for the node to import those side chains, it's will become a problem for live tracer such what our Firehose system is doing.
For a bit of context, live extended tracer is something that we contributed with the help of the Geth to the
geth
codebase directly. It is going to be part of Geth 1.14 which at some point will make to BSC.Now, those live tracer usually needs to deals with fork block. In our Firehose live tracer, we deal with forks and the handling is done by providing the finality signal to the live tracer (see https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L112 and https://github.com/ethereum/go-ethereum/blob/master/core/tracing/hooks.go#L66).
Now if the tracer receives forked block (side chain) that are actually below the last finalized block, than this is incorrect and the chain is making something that doesn't make sense.
It could be argue that this would be something just for "live tracer" and I could deal with there. But I oppose the counter-argument that wasting time importing useless side chain is something is more a performance optimization that benefits anyone syncing.
This is something that received some attention in the past, from some previous issue we opened last year:
- Reorgs cause "old forked blocks" (below the reorg junction block) to be inserted/processed #1847
- fetcher: no import blocks before or equal to the finalized height #1854
This worked for a while but seems some edge cases still happen today. I hope this add the necessary details so that a correct "fix" can be implemented. Happy to jump on a call if further details is needed.
@maoueh I agree, you are right. We don't need to import side-chain blocks below finalized height. I will reopen this issue and we can fix it.
@maoueh pls try the latest develop branch
@buddh0 @zzzckck Thanks folks!
Once it's in a release, we will monitor closely and report. It will take time as it was not happening often, but you can be confident that I'll report back at some point.
System information
Geth version:
geth version 1.3.8-fh2.2-f2403f09-20240131
OS & Version: LinuxExpected behaviour
Over 2 weeks, we saw a couple of occurrences where blocks were imported below the final block of the chain like if the final block was not really final or that we import known forked block(s).
This should not happen.
Actual behaviour
Here a trimmed down version of the logs highlighted to point where block get inserted below the final block:
The
final_diff
,final_number
andfinal_hash
were added to theImported new chain segment
line with this code:And
Steps to reproduce the behaviour
As you can see from the logs, this is not something that is happening quite often but still produces at regular interval so far. This is a 2 two weeks sample above but it's been at least 6 weeks maybe more since we started to see this problem.
We can run modified version with extra logging if you desire and extract data for you if you want.