paradigmxyz / reth

Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
https://reth.rs/
Apache License 2.0
3.59k stars 936 forks source link

stuck at transactionlookup during initial sync beta.6 #8005

Open mrabino1 opened 2 months ago

mrabino1 commented 2 months ago

Describe the bug

stuck at transactionlookup during initial sync beta.6

Steps to reproduce

start beta.6 .. wait..

Node logs

t$ tail -f reth.log 
2024-04-30T18:00:44.892382Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:47.097225Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:00:47.097291Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:50.587815Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:53.069736Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:56.289932Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:56.619768Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:57.527203Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:00:58.589300Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:00:58.589901Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:01.071086Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:01.071226Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:04.620290Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:04.620334Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:05.528549Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:05.528815Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:07.284335Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:07.322339Z  INFO reth::cli: Status connected_peers=100 freelist=20 stage=TransactionLookup checkpoint=0 target=19724655
2024-04-30T18:01:08.290930Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:11.285505Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:12.414276Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:12.414441Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:15.285251Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:15.285474Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:18.462425Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:18.487952Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:19.286306Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:19.286538Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:20.291347Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:22.578627Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:26.463553Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:26.463763Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:26.489068Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:26.489093Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:30.579367Z DEBUG jsonrpsee-server: HTTP serve connection failed hyper::Error(IncompleteMessage)
2024-04-30T18:01:30.579427Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:32.292617Z DEBUG jsonrpsee-server: Accepting new connection 1/500
2024-04-30T18:01:32.321935Z  INFO reth::cli: Status connected_peers=100 freelist=20 stage=TransactionLookup checkpoint=0 target=19724655
2024-04-30T18:01:33.887100Z DEBUG jsonrpsee-server: Accepting new connection 1/500

Platform(s)

Linux (x86)

What version/commit are you on?

beta.6

What database version are you on?

beta.6

Which chain / network are you on?

mainnet

What type of node are you running?

Full via --full flag

What prune config do you use, if any?

No response

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

Bjorn248 commented 2 months ago

I am experiencing the same behavior. Happy to share logs and metrics if needed.

Bjorn248 commented 2 months ago

@mrabino1 I was able to get this to progress by changing chunk_size to 500000 under [stages.transaction_lookup] in reth.toml.

EDIT: it seems like it is stuck again after doing some work this time

2024-05-08T00:36:48.788883Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2325965621..2326465674
2024-05-08T00:36:49.130833Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2326465674..2326965744
2024-05-08T00:36:49.476585Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2326965744..2327465861
2024-05-08T00:36:49.784894Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2327465861..2327965945
2024-05-08T00:36:50.082491Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2327965945..2328466063
2024-05-08T00:36:50.410233Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2328466063..2328966073
2024-05-08T00:36:50.731880Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2328966073..2329466293
2024-05-08T00:36:51.080499Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2329466293..2329966329
2024-05-08T00:36:51.433821Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2329966329..2330466429
2024-05-08T00:36:51.753505Z  INFO sync::stages::transaction_lookup: Calculating transaction hashes tx_range=2330466429..2330851491
2024-05-08T00:37:11.990142Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:37:37.000067Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:38:01.993013Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:38:26.997789Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:38:51.990082Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:39:16.991909Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:39:41.989875Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-08T00:40:06.989553Z  INFO reth::cli: Status connected_peers=49 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
emhane commented 2 months ago

seems etl related @joshieDo

yixun-alpha commented 2 months ago

Encountered same issue on beta.6 (another issue with similar symptoms #7272), have changed the chunk_size to 500000 as well -- currently progressing, we'll see if it gets stuck again...

Bjorn248 commented 2 months ago

Found a new line in the logs in case it is helpful.

2024-05-09T23:00:40.094650Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-09T23:01:05.095367Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-09T23:01:30.108897Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-09T23:01:55.098311Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-09T23:02:20.095285Z  INFO reth::cli: Status connected_peers=50 freelist=98259483 stage=TransactionLookup checkpoint=0 target=19636464
2024-05-09T23:07:44.580792Z  WARN libmdbx: Long-lived read transaction has been timed out open_duration=300.001953754s

After this point, reth didn't produce any additional log output until it was restarted.

mattsse commented 2 months ago

2024-05-09T23:07:44.580792Z WARN libmdbx: Long-lived read transaction has been timed out open_duration=300.001953754s

@shekhirin why does this happen here?

https://github.com/paradigmxyz/reth/blob/aa5c52b944ba47f59512700571a19d8ae877abb2/crates/stages/src/stages/tx_lookup.rs#L118-L119

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 21 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 7 days since being marked as stale.