ncavedale-xlabs opened 1 month ago
We have also noticed that the safe and finalized blocks are not progressing since the upgrade to v1.1.0:
Latest Block Number: 5691261
Safe Block Number: 5601958
Finalized Block Number: 5601958
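For reference, one way to watch these values from outside the client is to query the safe and finalized tags over JSON-RPC (a minimal sketch, assuming the node exposes HTTP JSON-RPC on localhost:8545; adjust the endpoint to your setup):
# the hex "number" field in each result is the block height for that tag
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["safe", false],"id":1}' \
  http://localhost:8545
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["finalized", false],"id":1}' \
  http://localhost:8545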
It looks like the node is in pipeline sync, which syncs a large batch of blocks in the first log. How long does that take to complete?
It's been catching up since Tuesday (disk was full and we had to extend it). It was down for less than an hour.
I understand that, but how long did that pipeline run take? In the logs we have:
Oct 17 12:54:28 reth[2168334]: 2024-10-17T12:54:28.896596Z INFO Preparing stage pipeline_stages=10/14 stage=TransactionLookup checkpoint=5690093 target=5690672
There should be stages before this one, and a Finish stage to close out the cycle. It would be good to know how long this specific range of blocks took. Additionally, is there a berachain reth repo? It's possible they may need to tune some stage parameters on their end, to improve performance on a chain like berachain
the datadir is chosen based on the configured chainspec
how did you run the command and what's the chainspec/genesis.json?
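For context, a rough sketch of what that means in practice, assuming the default Linux (XDG) layout and no explicit --datadir (the exact directory contents can differ between reth versions):
# with no --datadir, reth derives the data directory from the configured chain
ls ~/.local/share/reth/            # one subdirectory per chain, e.g. mainnet
ls ~/.local/share/reth/mainnet     # that chain's database files, etc.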
They don't have their own reth version.
The pipeline ran for about 20 minutes:
Oct 17 12:38:56 reth[2168334]: 2024-10-17T12:38:56.077450Z INFO Preparing stage pipeline_stages=1/14 stage=Headers checkpoint=5690093 target=None
...
...
Oct 17 12:59:01 reth[2168334]: 2024-10-17T12:59:01.272894Z INFO Finished stage pipeline_stages=14/14 stage=Finish checkpoint=5690672 target=5690672
The command we use is:
reth node \
--chain=/var/lib/berachain/data/execution/genesis.json \
--datadir=/var/lib/berachain/data/execution \
--engine.experimental
The genesis file uses "chainId": 80084: https://raw.githubusercontent.com/berachain/beacon-kit/main/testing/networks/80084/eth-genesis.json
I see, that's why the reth db version command is crashing: if no --chain arg is provided, it falls back to mainnet by default
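So, presumably, passing the same chain and datadir that the node itself uses should point the db commands at the right database (a sketch based on the node command above; check reth db --help for the exact flag placement on your version):
reth db \
  --datadir=/var/lib/berachain/data/execution \
  --chain=/var/lib/berachain/data/execution/genesis.json \
  version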
Do you have the full logs for this run?
Oct 17 12:38:56 reth[2168334]: 2024-10-17T12:38:56.077450Z INFO Preparing stage pipeline_stages=1/14 stage=Headers checkpoint=5690093 target=None
...
...
Oct 17 12:59:01 reth[2168334]: 2024-10-17T12:59:01.272894Z INFO Finished stage pipeline_stages=14/14 stage=Finish checkpoint=5690672 target=5690672
I'd like to see in which stage most of the time was spent.
Attached is the full log for this run: full_log.txt
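For anyone reading along, a quick way to see where the time went in a log like this is to pull out just the stage boundary lines (a sketch assuming the journald-prefixed format shown above; adjust the filename to wherever the log lives):
# per-stage elapsed time is roughly the gap between consecutive "Preparing stage"
# timestamps, or between a stage's start and its "Finished stage" line
grep -E 'Preparing stage|Finished stage|Committed stage progress' full_log.txt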
This issue is stale because it has been open for 21 days with no activity.
@mattsse and @Rjected, I have just upgraded one of our Ethereum nodes from v1.0.6 to v1.1.0.
The upgrade causes the client to go through all 12 pipeline stages, causing a huge outage of the node.
The upgrade was done 2h 10min ago and the node is still on stage 2 (Committed stage progress pipeline_stages=2/12 stage=Bodies checkpoint=16580000 target=21165513 stage_progress=78.33%).
CC: @ncavedale-xlabs
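While the pipeline is replaying stages and the node isn't reporting block height, progress can also be watched from outside the logs via eth_syncing (a minimal sketch, assuming HTTP JSON-RPC is enabled on the default localhost:8545):
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_syncing","params":[],"id":1}' \
  http://localhost:8545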
I'm now syncing a backlog of 56 days on mainnet; it has been running for 10+ hours and is still at stage 9/14. I think it's far slower than what Geth would do.
@totoCZ and @andreclaro can we get exact machine specs (cpu, ram, exact disk model especially) or instance types?
2x Gold 5120, 256G RAM, 8x Crucial MX500 2TB
I just checked it now and it is synced and following the chain again
Describe the bug
Hi team! We're running a testnet Berachain node, using reth as the execution client. We noticed that whenever the reth service is down for just a few minutes (2 or 3 minutes), it takes several hours to catch up again (during this time, reth does not report block height). We're using pruning settings to keep some history.
Steps to reproduce
1. Stop reth for 5 minutes or even less
2. Start reth again and measure how long it takes to catch up (see the sketch below)
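A minimal shell sketch of those two steps, assuming reth runs as a systemd service named reth and exposes HTTP JSON-RPC on localhost:8545 (both are assumptions; adapt to your setup):
# 1. stop reth briefly (5 minutes here)
sudo systemctl stop reth
sleep 300
# 2. start it again and watch how long it takes before the reported block height moves
sudo systemctl start reth
while true; do
  date
  curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    http://localhost:8545
  echo
  sleep 60
done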
Node logs
.cache/reth/logs/80084/reth.log:
Platform(s)
Linux (x86)
What version/commit are you on?
We noticed this issue with the following version:
What database version are you on?
For some reason this seems to point to mainnet, even though this is a testnet node (mainnet has not been launched yet):
Which chain / network are you on?
Berachain testnet
What type of node are you running?
Pruned with custom reth.toml config
What prune config do you use, if any?
If you've built Reth from source, provide the full command you used
cargo build --locked --profile maxperf