erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0
3.09k stars 1.08k forks source link

Engine API: NewPayload fails with a "context canceled" error when calling CurrentHeader/GetHeader #9786

Closed elihallufgil closed 4 months ago

elihallufgil commented 5 months ago

System information

Erigon version: 2.59.0-0f0a4452

OS & Version: ubuntu-jammy-22.04

Erigon Command (with flags/config): erigon --db.size.limit=8TB --chain=mainnet --datadir=/opt/data/erigon --authrpc.addr=0.0.0.0 --authrpc.vhosts= --authrpc.port=8551 --authrpc.jwtsecret=/engine.jwt --prune=default --ws --http --http.vhosts= --http.corsdomain=* --http.addr=0.0.0.0 --http.port=8545 --http.api=engine,net,eth,web3,debug,trace --metrics --metrics.addr=0.0.0.0 --metrics.port=6060 --torrent.download.rate=512mb

Consensus Layer: Prysm v.5.0.1

Consensus Layer Command (with flags/config): beacon-chain --accept-terms-of-use=true --datadir=/opt/app/beacon --p2p-host-dns= --mainnet=true --block-batch-limit=512 --rpc-host=0.0.0.0 --grpc-gateway-host=0.0.0.0 --p2p-max-peers=100 --execution-endpoint=http://127.0.0.1:8545 --monitoring-host=0.0.0.0 --monitoring-port=8080 --p2p-tcp-port=13001 --p2p-udp-port=12001 --jwt-secret=/engine.jwt

Chain/Network: Ethereum Mainnet

Actual behaviour

After we upgraded our archive erigon node from 2.58.2 to 2.59.0, we have a database issue that we didn't have with the previous version. Our applications that are using rpc/ws endpoints of the node, are now causing the following logs at the node:

image

Also sometimes rpc requests are ending with 'context canceled' error:

[rpc] served conn=127.0.0.1:60950 method=eth_getLogs reqid=4 t=37.440560913s err="context canceled"

When the node is not used for rpc requests, everything works fine. We think it is related with some performance issue caused by the new version

dylancaponi commented 5 months ago

Same issue: https://github.com/ledgerwatch/erigon/issues/9767

elihallufgil commented 5 months ago

Is there a way to downgrade to 2.58.2 until this problem will be solved? When we try to downgrade, the node logs a panic message and restarts.

temuera commented 5 months ago

archive node with --internalcl.

WARN[03-24|13:27:14.605] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:27.470] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
INFO[03-24|13:27:27.874] RPC Daemon notified of new headers       from=19502329 to=19502330 amount=1 hash=0xba88c0533e86e19a49158fd0aeffbd481e2ec0e5476892eeb703b4d51fdb11a4 header sending=28.143µs log sending=591ns
INFO[03-24|13:27:27.874] head updated                             hash=0xba88c0533e86e19a49158fd0aeffbd481e2ec0e5476892eeb703b4d51fdb11a4 number=19502330
WARN[03-24|13:27:28.645] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:39.793] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:27:39.965] RPC Daemon notified of new headers       from=19502330 to=19502331 amount=1 hash=0x240d8b915be69f128a6df8c05d3bd72d70c55cd73e52bf34ca1a9884b37fde23 header sending=33.883µs log sending=481ns
INFO[03-24|13:27:39.965] head updated                             hash=0x240d8b915be69f128a6df8c05d3bd72d70c55cd73e52bf34ca1a9884b37fde23 number=19502331
WARN[03-24|13:27:40.758] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:50.718] P2P                                      app=caplin peers=37
INFO[03-24|13:27:56.906] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:27:57.187] RPC Daemon notified of new headers       from=19502331 to=19502332 amount=1 hash=0x77fe14b4061ba2ba0aab1118fc4f8bce47082d160093fa6efc6cd21157f33368 header sending=25.959µs log sending=491ns
INFO[03-24|13:27:57.187] head updated                             hash=0x77fe14b4061ba2ba0aab1118fc4f8bce47082d160093fa6efc6cd21157f33368 number=19502332
WARN[03-24|13:27:57.882] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:02.615] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:02.839] RPC Daemon notified of new headers       from=19502332 to=19502333 amount=1 hash=0x96087bd28dffd58aaecc866530d5b00a788c0d977b62caf4e3d7e26cda14cf39 header sending=27.552µs log sending=602ns
INFO[03-24|13:28:02.839] head updated                             hash=0x96087bd28dffd58aaecc866530d5b00a788c0d977b62caf4e3d7e26cda14cf39 number=19502333
WARN[03-24|13:28:03.582] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:13.374] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:13.674] RPC Daemon notified of new headers       from=19502333 to=19502334 amount=1 hash=0xac576282e18cac8cce79b0c0aeddf83300ecaa05d0ce45f7999c9540251bb8b9 header sending=26.049µs log sending=621ns
INFO[03-24|13:28:13.674] head updated                             hash=0xac576282e18cac8cce79b0c0aeddf83300ecaa05d0ce45f7999c9540251bb8b9 number=19502334
WARN[03-24|13:28:14.440] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:21.797] [p2p] GoodPeers                          eth68=8 eth67=8
INFO[03-24|13:28:30.335] [mem] memory stats                       Rss=232.3MB Size=0B Pss=232.3MB SharedClean=4B SharedDirty=0B PrivateClean=209.2MB PrivateDirty=23.1MB Referenced=216.1MB Anonymous=23.1MB Swap=0B alloc=12.3GB sys=30.8GB
INFO[03-24|13:28:30.754] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:30.942] RPC Daemon notified of new headers       from=19502334 to=19502335 amount=1 hash=0xd5917c99fefa8e75b83a83851b5a87990ef88efc498f6529c24ba9bf6891f3be header sending=26.66µs log sending=481ns
INFO[03-24|13:28:30.943] head updated                             hash=0xd5917c99fefa8e75b83a83851b5a87990ef88efc498f6529c24ba9bf6891f3be number=19502335
WARN[03-24|13:28:31.602] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
Giulio2002 commented 5 months ago

archive node with --internalcl.

WARN[03-24|13:27:14.605] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:27.470] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
INFO[03-24|13:27:27.874] RPC Daemon notified of new headers       from=19502329 to=19502330 amount=1 hash=0xba88c0533e86e19a49158fd0aeffbd481e2ec0e5476892eeb703b4d51fdb11a4 header sending=28.143µs log sending=591ns
INFO[03-24|13:27:27.874] head updated                             hash=0xba88c0533e86e19a49158fd0aeffbd481e2ec0e5476892eeb703b4d51fdb11a4 number=19502330
WARN[03-24|13:27:28.645] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:39.793] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:27:39.965] RPC Daemon notified of new headers       from=19502330 to=19502331 amount=1 hash=0x240d8b915be69f128a6df8c05d3bd72d70c55cd73e52bf34ca1a9884b37fde23 header sending=33.883µs log sending=481ns
INFO[03-24|13:27:39.965] head updated                             hash=0x240d8b915be69f128a6df8c05d3bd72d70c55cd73e52bf34ca1a9884b37fde23 number=19502331
WARN[03-24|13:27:40.758] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:27:50.718] P2P                                      app=caplin peers=37
INFO[03-24|13:27:56.906] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:27:57.187] RPC Daemon notified of new headers       from=19502331 to=19502332 amount=1 hash=0x77fe14b4061ba2ba0aab1118fc4f8bce47082d160093fa6efc6cd21157f33368 header sending=25.959µs log sending=491ns
INFO[03-24|13:27:57.187] head updated                             hash=0x77fe14b4061ba2ba0aab1118fc4f8bce47082d160093fa6efc6cd21157f33368 number=19502332
WARN[03-24|13:27:57.882] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:02.615] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:02.839] RPC Daemon notified of new headers       from=19502332 to=19502333 amount=1 hash=0x96087bd28dffd58aaecc866530d5b00a788c0d977b62caf4e3d7e26cda14cf39 header sending=27.552µs log sending=602ns
INFO[03-24|13:28:02.839] head updated                             hash=0x96087bd28dffd58aaecc866530d5b00a788c0d977b62caf4e3d7e26cda14cf39 number=19502333
WARN[03-24|13:28:03.582] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:13.374] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:13.674] RPC Daemon notified of new headers       from=19502333 to=19502334 amount=1 hash=0xac576282e18cac8cce79b0c0aeddf83300ecaa05d0ce45f7999c9540251bb8b9 header sending=26.049µs log sending=621ns
INFO[03-24|13:28:13.674] head updated                             hash=0xac576282e18cac8cce79b0c0aeddf83300ecaa05d0ce45f7999c9540251bb8b9 number=19502334
WARN[03-24|13:28:14.440] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"
INFO[03-24|13:28:21.797] [p2p] GoodPeers                          eth68=8 eth67=8
INFO[03-24|13:28:30.335] [mem] memory stats                       Rss=232.3MB Size=0B Pss=232.3MB SharedClean=4B SharedDirty=0B PrivateClean=209.2MB PrivateDirty=23.1MB Referenced=216.1MB Anonymous=23.1MB Swap=0B alloc=12.3GB sys=30.8GB
INFO[03-24|13:28:30.754] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload) 
INFO[03-24|13:28:30.942] RPC Daemon notified of new headers       from=19502334 to=19502335 amount=1 hash=0xd5917c99fefa8e75b83a83851b5a87990ef88efc498f6529c24ba9bf6891f3be header sending=26.66µs log sending=481ns
INFO[03-24|13:28:30.943] head updated                             hash=0xd5917c99fefa8e75b83a83851b5a87990ef88efc498f6529c24ba9bf6891f3be number=19502335
WARN[03-24|13:28:31.602] failed to produce and cache attestation data app=caplin stage=ForkChoice err="GetBlockRootAtSlot: slot in the future"

this is caplin but it is safe to ignore the warn is innocuous

battlmonstr commented 4 months ago

Related stack traces:

newPayload_stack

newHandler_stack

battlmonstr commented 4 months ago

@elihallufgil , thanks for the bug report. I'm not sure it is possible to downgrade, but I'm trying to understand and help to solve or workaround the problem.

Here are the things to try:

battlmonstr commented 4 months ago

Some fixes related to this will be included in the upcoming erigon v2.6x https://github.com/ledgerwatch/erigon/commit/bd1a458a8e9b7747c19bfb4bcc817c33c82ba698

Please reopen if it happens again.