paradigmxyz / reth

Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
https://reth.rs/
Apache License 2.0
3.94k stars 1.18k forks source link

Timeouts with Lighthouse CL #7322

Open 0xAlcibiades opened 7 months ago

0xAlcibiades commented 7 months ago

Describe the bug

Screenshot 2024-03-24 at 20 14 54

Have verified sub millisecond (0.2ms) connectivity between reth and lighthouse with no packet drops.

Steps to reproduce

Run reth beta 3 with lighthouse v5.1.2, see timeouts in the log

Node logs

No response

Platform(s)

Linux (ARM)

What version/commit are you on?

v0.2.0-beta.3

What database version are you on?

2

What type of node are you running?

Archive (default)

What prune config do you use, if any?

No response

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

mattsse commented 7 months ago

could you please share the reth output for that timeframe

0xAlcibiades commented 7 months ago

So, it already log rotated out, I'll setup longer retention. The issue cleared after a few restarts of both.

0xAlcibiades commented 7 months ago

Ok, was able to see this again. Reth lost the chain tip for some reason:

Screenshot 2024-03-25 at 20 31 55 Screenshot 2024-03-25 at 20 31 36
pires commented 7 months ago

I am seeing this constantly with v0.2.0-beta.4 and Lighthouse v5.1.3, both tracking Holesky.

reth logs

:; sudo journalctl --since "5m ago" -fu reth
Apr 03 08:01:48 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:01:48.430067Z  INFO Status connected_peers=20 freelist=154228 latest_block=1271640
Apr 03 08:02:13 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:13.392638Z  INFO Status connected_peers=20 freelist=154228 latest_block=1271640
Apr 03 08:02:38 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:38.390798Z  INFO Status connected_peers=21 freelist=154228 latest_block=1271640
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.176525Z  INFO Canonical chain committed number=1271684 hash=0xe154ed4211a29ac9f80dff71b5d2f3884abb9b043557930f4ab8c597a8df5ea9 elapsed=539.833360863s
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.206080Z  INFO Block is already canonical, ignoring. block_hash=0x6b1b8845d971ef5e22c9f00e9e10b756a20329e3c2dafb0eb7cead4a79ac57b9
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.319783Z  INFO Pruner started tip_block_number=1271684
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.319857Z  INFO Static File Producer started targets=StaticFileTargets { headers: Some(1271587..=1271617), receipts: Some(1271587..=1271617), transactions: Some(1271587..=1271617) }
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.638166Z  INFO Pruner finished tip_block_number=1271684 elapsed=318.398433ms stats={Receipts: (Finished, 1391), Headers: (Finished, 186), Transactions: (Finished, 1391)}
Apr 03 08:02:40 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:40.850325Z  INFO Static File Producer finished targets=StaticFileTargets { headers: Some(1271587..=1271617), receipts: Some(1271587..=1271617), transactions: Some(1271587..=1271617) } elapsed=530.41249ms
Apr 03 08:02:41 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:41.200522Z  INFO Block is already canonical, ignoring. block_hash=0x6b1b8845d971ef5e22c9f00e9e10b756a20329e3c2dafb0eb7cead4a79ac57b9
Apr 03 08:02:47 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:47.355549Z  INFO Block is already canonical, ignoring. block_hash=0x253dcb8a1f8db00cd2150a8aaacddc1e8445bd145271d1247474233c3ff7293e
Apr 03 08:02:48 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:48.032833Z  INFO Block is already canonical, ignoring. block_hash=0x70c7473f508b0b396fd223828302690fe52aba3ae24cd8432156928a21535d16
Apr 03 08:02:49 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:49.318919Z  INFO Block is already canonical, ignoring. block_hash=0x1b553453a9d8cf3fbb11686d487ed054c7365379bdee6f22cda5b0307fe12019
Apr 03 08:02:55 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:02:55.228573Z  INFO Block is already canonical, ignoring. block_hash=0x2f9b8ac9742b11f8772a5fa45516ec534801abc626a5db2e36fbd839846d5601
Apr 03 08:03:03 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:03:03.395901Z  INFO Status connected_peers=22 freelist=185777 latest_block=1271684
Apr 03 08:03:28 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:03:28.397712Z  INFO Status connected_peers=22 freelist=185777 latest_block=1271684
Apr 03 08:03:53 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:03:53.400624Z  INFO Status connected_peers=22 freelist=185777 latest_block=1271684
Apr 03 08:04:18 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:04:18.414610Z  INFO Status connected_peers=22 freelist=185777 latest_block=1271684
Apr 03 08:04:43 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:04:43.395862Z  INFO Status connected_peers=23 freelist=185777 latest_block=1271684
Apr 03 08:05:08 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:05:08.500964Z  INFO Status connected_peers=22 freelist=185777 latest_block=1271684
Apr 03 08:05:33 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:05:33.390415Z  INFO Status connected_peers=20 freelist=185777 latest_block=1271684
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:05:58.391219Z  INFO Status connected_peers=19 freelist=185777 latest_block=1271684
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:05:58.741640Z  WARN The database read transaction has been open for too long. Backtrace:
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    0: reth_db::implementation::mdbx::tx::MetricsHandler<K>::log_backtrace_on_long_read_transaction
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    1: core::ptr::drop_in_place<reth_db::implementation::mdbx::tx::MetricsHandler<reth_libmdbx::transaction::RO>>
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    2: core::ptr::drop_in_place<reth_provider::providers::database::provider::DatabaseProvider<reth_db::implementation::mdbx::tx::Tx<reth_libmdbx::transaction::RO>>>
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    3: rayon::iter::plumbing::bridge_producer_consumer::helper
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    4: rayon_core::join::join_context::{{closure}}
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    5: rayon::iter::plumbing::bridge_producer_consumer::helper
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    6: rayon_core::join::join_context::{{closure}}
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    7: rayon::iter::plumbing::bridge_producer_consumer::helper
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    8: rayon_core::join::join_context::{{closure}}
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:    9: rayon::iter::plumbing::bridge_producer_consumer::helper
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   10: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   11: rayon_core::registry::WorkerThread::wait_until_cold
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   12: rayon_core::join::join_context::{{closure}}
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   13: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   14: rayon_core::registry::WorkerThread::wait_until_cold
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   15: std::sys_common::backtrace::__rust_begin_short_backtrace
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   16: core::ops::function::FnOnce::call_once{{vtable.shim}}
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   17: std::sys::pal::unix::thread::Thread::new::thread_start
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   18: <unknown>
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:   19: <unknown>
Apr 03 08:05:58 t-eth-holesky-02 reth[1047094]:  open_duration=173.946531598s self.txn_id=30939
Apr 03 08:06:20 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:06:20.370740Z  WARN Beacon client online, but no consensus updates received for a while. Please fix your beacon client to follow the chain! period=205.141156666s
Apr 03 08:06:23 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:06:23.391253Z  INFO Status connected_peers=19 freelist=185777 latest_block=1271684
Apr 03 08:06:48 t-eth-holesky-02 reth[1047094]: 2024-04-03T08:06:48.391019Z  INFO Status connected_peers=19 freelist=185777 latest_block=1271684
github-actions[bot] commented 6 months ago

This issue is stale because it has been open for 21 days with no activity.

jasonlinSxT commented 3 months ago

Still seeing same issue with Lighthouse CL with reth v0.2.0-beta.9

YeautyYE commented 1 month ago

Still seeing same issue with Lighthouse CL with reth v1.0.6

richyk1 commented 1 month ago

Encountering the same issue it seems on Reth v1.0.6 and Lighthouse image - lighthouse:v5.1.3-modern

lighthouse-1  | Sep 30 12:30:56.737 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lighthouse-1  | Sep 30 12:30:59.574 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:30:59.574 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lighthouse-1  | Sep 30 12:31:05.000 INFO Syncing                                 est_time: --, distance: 45 slots (9 mins), peers: 60, service: slot_notifier
lighthouse-1  | Sep 30 12:31:13.569 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:31:13.569 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
reth-1        | 2024-09-30T12:31:16.730788Z  INFO Status connected_peers=12 latest_block=20863437
lighthouse-1  | Sep 30 12:31:17.000 INFO Syncing                                 est_time: --, distance: 46 slots (9 mins), peers: 61, service: slot_notifier
lighthouse-1  | Sep 30 12:31:29.000 INFO Syncing                                 est_time: --, distance: 47 slots (9 mins), peers: 65, service: slot_notifier
lighthouse-1  | Sep 30 12:31:41.001 INFO Syncing                                 est_time: --, distance: 48 slots (9 mins), peers: 63, service: slot_notifier
reth-1        | 2024-09-30T12:31:41.730428Z  INFO Status connected_peers=11 latest_block=20863437
lighthouse-1  | Sep 30 12:31:42.121 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:31:42.121 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lighthouse-1  | Sep 30 12:31:45.769 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:31:45.769 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lighthouse-1  | Sep 30 12:31:53.000 INFO Syncing                                 est_time: --, distance: 49 slots (9 mins), peers: 67, service: slot_notifier
lighthouse-1  | Sep 30 12:31:55.916 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:31:55.916 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lighthouse-1  | Sep 30 12:31:58.706 WARN Execution engine call failed            error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out), service: exec
lighthouse-1  | Sep 30 12:31:58.707 WARN Error whilst processing payload status  error: Api { error: HttpClient(url: http://reth:8551/, kind: timeout, detail: operation timed out) }, service: exec
lacotWin commented 2 weeks ago

I am getting same timeout errors in v1.0.3