Closed e1ghtball closed 3 months ago
@e1ghtball Hi, the log in op-node: pre fetch receipts fail without error, need retry
is not important, you can notice that its level is debug, which means that the pre-fetch receipts cache is full, so the pre-fetch work cannot continue and needs to be retried constantly. At the same time, the Peer's transaction rejected
in op-geth is also not a big problem, it means that some transactions broadcasted by peers are rejected due to "insufficient funds for gas * price + value" reasons.
These logs can also appear in normal nodes. In the logs you provided, only one line is abnormal: lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=153
, which means that the derivation.step process of the op-node has failed 153 times. The specific reason for the failure have been printed in an error or warn log in a place that you did not provide. You need to provide more op-node logs so that we can determine where the problem occurred.
@welkin22 is it correct part of logs?
t=2024-03-17T10:37:49+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xfe27a1d1b36b277c82e6abbd2b42b18436ac74d60187dbe9f39e90754d3b0974 blockNumber=36,748,957
t=2024-03-17T10:37:49+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xfa267d23ed5467de28f2b9faf2cdff4518b18e428c5d8f1f53db2307cef46790 blockNumber=36,748,952
t=2024-03-17T10:37:49+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x518fd71216ac40f04b5f407b3823f835d1abc992ae871173833ab5c4bb484885 blockNumber=36,748,959
t=2024-03-17T10:37:49+0000 lvl=eror msg="Payload execution failed" block_hash=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744 err="Post \"http://localhost:8551\": context deadline exceeded"
t=2024-03-17T10:37:49+0000 lvl=warn msg="Derivation process temporary error" attempts=1 err="engine stage failed: temp: failed to update insert payload: failed to execute payload: Post \"http://localhost:8551\": context deadline exceeded"
t=2024-03-17T10:37:49+0000 lvl=dbug msg="scheduling re-attempt with delay" attempts=1 delay=2.04s
t=2024-03-17T10:37:49+0000 lvl=dbug msg="L1 head moved forward" l1_head=0x8e8135333146af863d3c048aad0321c595eb53c05acbbb813c4c2305a4a26a5f:37045601
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=dbug msg="requesting missing unsafe L2 block range" start=0x296d6788275f2c8d464f1420770b732cf0a811d4faecccfb2d591c2f648c7406:18176880 end=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744:18918141 size=741,261
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring request to sync L2 range, timestamp is too old for p2p" start=0x296d6788275f2c8d464f1420770b732cf0a811d4faecccfb2d591c2f648c7406:18176880 end=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744:18918141 start_time=1,709,930,603
t=2024-03-17T10:37:49+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0x3606840e4fd878d440c23e4258d0250f04fd23617906ff6844a26e3afc794927:18918142
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xecd1a5b5e3a3493a9e608a6cf206bede030fb70955ff5eef625c6fda3a182c0c:18918143
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xbb4c1fef7d8f429ac9db432685f19b15253245ee329971e70fdb59d02e4f418b:18918144
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xe5291ea7a79e95e542b5c5823bcd61ae25b7c396c566f1a966f6d5d1768feb46:18918145
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xc6a2926aae0ccffa09b55de31e4553d6112ea591efbe3dc65759a1010946450e:18918146
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=dbug msg="L1 head moved forward" l1_head=0x62ffce8981a45fa593ef9dfd23d17fd4a69ff7e833ac1df8f3bd03c223f933af:37045602
t=2024-03-17T10:37:49+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:49+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xe696ba2655a3c95fe80e4eb005d633ba73783ddff137079a731db5106fb81725 blockNumber=36,748,961
t=2024-03-17T10:37:49+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xbf534a584a04455ae494693499aec5ffb426d1d15b2ddf0e9f25d288ba380c39 blockNumber=36,748,956
t=2024-03-17T10:37:49+0000 lvl=dbug msg="Revalidated node" p2p=discv5 b=15 id=8b59a309783df517e66695d493f75c0ddb15d7c44355c6fec020366a6ece047e checks=2
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x66edcb0ac875de8110164edf77bd7b86871f2c7bf2040e7dde0166fa38f24f00 blockNumber=36,748,960
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x21853c48a1f134f213e8db0e8c5caf63b802eefbd2c1f18ac3ab26984dd9cbdf blockNumber=36,748,958
t=2024-03-17T10:37:50+0000 lvl=dbug msg="peering tick" connected=28 advertised_udp=9003 advertised_tcp=9003 advertised_ip=23.153.72.203
t=2024-03-17T10:37:50+0000 lvl=dbug msg="validated block" result=ACCEPT from=16Uiu2HAm842xSEbJhpL1ZSebb3yft9C3zLnrQpb3d4idgi1bLNzY
t=2024-03-17T10:37:50+0000 lvl=info msg="Received signed execution payload from p2p" id=0xc9c653059db74d7a95e6273bb05629b78e1e2bae8994ab622e81899cae3a50c4:18918147 peer=16Uiu2HAm842xSEbJhpL1ZSebb3yft9C3zLnrQpb3d4idgi1bLNzY
t=2024-03-17T10:37:50+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xc9c653059db74d7a95e6273bb05629b78e1e2bae8994ab622e81899cae3a50c4:18918147
t=2024-03-17T10:37:50+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xdfbdcdc53ca64adae5b490639b087e61915bdca7b88e334fab6e016ced3bc5a7 blockNumber=36,748,953
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x79ded6f894377dd27f3ccedd61c9fd13ed3ecc33b8ed878de95e9bda57770b33 blockNumber=36,748,955
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x45dc47a8d3a03f60813fcf213e4199740df5775d9e9cf4e8082c42819369513f blockNumber=36,748,954
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xfe27a1d1b36b277c82e6abbd2b42b18436ac74d60187dbe9f39e90754d3b0974 blockNumber=36,748,957
t=2024-03-17T10:37:50+0000 lvl=dbug msg="requesting missing unsafe L2 block range" start=0x296d6788275f2c8d464f1420770b732cf0a811d4faecccfb2d591c2f648c7406:18176880 end=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744:18918141 size=741,261
t=2024-03-17T10:37:50+0000 lvl=dbug msg="ignoring request to sync L2 range, timestamp is too old for p2p" start=0x296d6788275f2c8d464f1420770b732cf0a811d4faecccfb2d591c2f648c7406:18176880 end=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744:18918141 start_time=1,709,930,603
t=2024-03-17T10:37:50+0000 lvl=info msg="New L1 finalized block" l1_finalized=0x65c84a4323ca95c58ffa2c4fbe5c3ca5149dca4d4d9e9eadec3e7c7aa217cabf:37045600
t=2024-03-17T10:37:50+0000 lvl=info msg="received L1 finality signal, but missing data for immediate L2 finalization" prev_finalized_l1=0x65c84a4323ca95c58ffa2c4fbe5c3ca5149dca4d4d9e9eadec3e7c7aa217cabf:37045600 signaled_finalized_l1=0x65c84a4323ca95c58ffa2c4fbe5c3ca5149dca4d4d9e9eadec3e7c7aa217cabf:37045600
t=2024-03-17T10:37:50+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:50+0000 lvl=info msg="New L1 safe block" l1_safe=0x8e8135333146af863d3c048aad0321c595eb53c05acbbb813c4c2305a4a26a5f:37045601
t=2024-03-17T10:37:50+0000 lvl=dbug msg="discovered peer" peer=16Uiu2HAmSDQYV7jnXRVNPgbM7VeCfGWBCfNngpbEmQ3rDTtzM6o4 nodeID=7044568b6b1df0444ccb989570a27ada097adab67198ce1f74f5c0e387fd5cb3 addr=/ip4/52.193.218.151/tcp/9003
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xfa267d23ed5467de28f2b9faf2cdff4518b18e428c5d8f1f53db2307cef46790 blockNumber=36,748,952
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x518fd71216ac40f04b5f407b3823f835d1abc992ae871173833ab5c4bb484885 blockNumber=36,748,959
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xe696ba2655a3c95fe80e4eb005d633ba73783ddff137079a731db5106fb81725 blockNumber=36,748,961
t=2024-03-17T10:37:50+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xbf534a584a04455ae494693499aec5ffb426d1d15b2ddf0e9f25d288ba380c39 blockNumber=36,748,956
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x66edcb0ac875de8110164edf77bd7b86871f2c7bf2040e7dde0166fa38f24f00 blockNumber=36,748,960
t=2024-03-17T10:37:51+0000 lvl=dbug msg="validated block" result=ACCEPT from=16Uiu2HAmBQuEvcdUjzFPoY7mKiBUKthwVHjWTnmiVtgQ2MqRnrrH
t=2024-03-17T10:37:51+0000 lvl=info msg="Received signed execution payload from p2p" id=0xd0bd52d77c3da5827346738fdbb1069327ee1f48359e09cdf9987a66447fa804:18918148 peer=16Uiu2HAmBQuEvcdUjzFPoY7mKiBUKthwVHjWTnmiVtgQ2MqRnrrH
t=2024-03-17T10:37:51+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xd0bd52d77c3da5827346738fdbb1069327ee1f48359e09cdf9987a66447fa804:18918148
t=2024-03-17T10:37:51+0000 lvl=dbug msg="ignoring step request, already scheduled re-attempt after previous failure" attempts=1
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x21853c48a1f134f213e8db0e8c5caf63b802eefbd2c1f18ac3ab26984dd9cbdf blockNumber=36,748,958
t=2024-03-17T10:37:51+0000 lvl=dbug msg="Removed dead node" p2p=discv5 b=14 id=cb59dbfad4d76e3dc48c67c822a92e5006df7c17234be1a7cb6cd172d9e8f43a ip=195.201.71.223 checks=0
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xdfbdcdc53ca64adae5b490639b087e61915bdca7b88e334fab6e016ced3bc5a7 blockNumber=36,748,953
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x79ded6f894377dd27f3ccedd61c9fd13ed3ecc33b8ed878de95e9bda57770b33 blockNumber=36,748,955
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0x45dc47a8d3a03f60813fcf213e4199740df5775d9e9cf4e8082c42819369513f blockNumber=36,748,954
t=2024-03-17T10:37:51+0000 lvl=dbug msg="pre fetch receipts fail without error,need retry" blockHash=0xfe27a1d1b36b277c82e6abbd2b42b18436ac74d60187dbe9f39e90754d3b0974 blockNumber=36,748,957
@e1ghtball Yes, the useful one is the following:
t=2024-03-17T10:37:49+0000 lvl=eror msg="Payload execution failed" block_hash=0x2a43355734071576a72c703f7b795cb2732afbf79edbd04618eb8c4171d76744 err="Post \"http://localhost:8551\": context deadline exceeded"
This means that your op-geth seems to have some problems, you can check the op-geth log to confirm the issue. Based on recent cases I have seen, I guess you may have encountered a problem related to db compaction. You can search for the keyword "Database compacting, degraded performance" in the op-geth logs. If you can find it, please follow the instructions in this document to try to fix the issue.
@welkin22 yes, I see logs like Database compacting, degraded performance
every minute.
And yes, I'm running archive node as described in document that you shared.
But specs of my machine are pretty good: Ryzen 9 5900X (12 cores/24 threads), 128GB RAM.
Also, I don't see any CPU load at all - screenshot from btop
Any chances to fix this problem without switching to pruned node?
@welkin22 yes, I see logs like
Database compacting, degraded performance
every minute. And yes, I'm running archive node as described in document that you shared. But specs of my machine are pretty good: Ryzen 9 5900X (12 cores/24 threads), 128GB RAM. Also, I don't see any CPU load at all - screenshot from btop Any chances to fix this problem without switching to pruned node?
@e1ghtball The key indicator is the performance of the disk, and you need to pay attention to the disk's throughput capacity. Your CPU and memory are already sufficient.
@welkin22 I tested geth working on Hetzner machines, which should definitely not have weak SSDs - the problem persisted When I upgraded to version 0.3.0, the problem persisted too
The problem only resolved with upgrading to 0.3.1 alpha and using pebble db. There were no snapshots for it, I had to wait until the state synchronises from scratch, so I write with such a delay
@welkin22 I tested geth working on Hetzner machines, which should definitely not have weak SSDs - the problem persisted When I upgraded to version 0.3.0, the problem persisted too
The problem only resolved with upgrading to 0.3.1 alpha and using pebble db. There were no snapshots for it, I had to wait until the state synchronises from scratch, so I write with such a delay
@e1ghtball Thank you for your feedback. 0.3.1 alpha is still in the public testing phase, and when we finally release it, we will plan to provide snapshots to help users quickly start new nodes.
Hello,
I'm experiencing the same issue on our end with database compacting and degraded performance. We are using NVMe disks with a RAID 0 configuration, Intel Xeon-E 2388G - 8c/16t - 3.2GHz/4.6GHz, and 64GB DDR4 ECC 3200MHz. When do you think the Pebble DB snapshot will be released?
Thanks
Hello,
I'm experiencing the same issue on our end with database compacting and degraded performance. We are using NVMe disks with a RAID 0 configuration, Intel Xeon-E 2388G - 8c/16t - 3.2GHz/4.6GHz, and 64GB DDR4 ECC 3200MHz. When do you think the Pebble DB snapshot will be released?
Thanks
@mateipopp Hello, if you want to run an archive node with Pebble DB, we recommend syncing a new node from scratch. Currently, we do not have plans to provide an archive snapshot.
System information
Network:
mainnet
if you are running a local node, please provide the following information: op-node version:
op-node version v0.0.0-8f623e64-1705481865
op-geth version:op-geth version 0.1.0-unstable-77340c47
OS & Version:Rocky Linux 9.2
Expected behaviour
op-node: regular
Optimistically queueing unsafe L2 execution payload
/Sync progress
/Received signed execution payload from p2p
logs op-geth regularChain head was updated
/Imported new potential chain segment
logsActual behaviour
op-node: A lot of
pre fetch receipts fail without error,need retry
logs with the same block numbers:op-geth:
Peer's transaction rejected
logs:Steps to reproduce the behaviour
start-op-node.sh
:start-op-geth.sh
: