paradigmxyz / reth

Modular, contributor-friendly and blazing-fast implementation of the Ethereum protocol, in Rust
https://reth.rs/
Apache License 2.0

Missing metrics under `--engine.experimental` #10630

Closed kousu closed 1 week ago

kousu commented 2 weeks ago

Describe the bug

When using --engine.experimental from the new release, some metrics are incorrectly reported as 0. For example, I scrape

reth_blockchain_tree_canonical_chain_height 0

even though my logs report

INFO Block added to canonical chain number=3663351 hash=0xd19bf53f640ba5359f8ac20cc689414b72e68bb6ebacaac1525517d24940c032 peers=41 txs=41 gas=7.37 Mgas gas_throughput=144.71 Mgas/second full=24.6% base_fee=0.00gwei blobs=0 excess_blobs=0 elapsed=50.903226ms

Removing --engine.experimental restores the metrics.


There is a longer set of missing metrics, I believe. Without --engine.experimental I did

curl -s http://localhost:9001/metrics > engine1.txt

and with it I did

curl -s http://localhost:9001/metrics > engine2.txt

then I diffed them to find the missing metrics:

```
$ grep -vxF -f <(grep -E '0$' engine1.txt) engine2.txt | grep -E '0$'
reth_blockchain_tree_trie_updates_insert_cached 0
reth_payloads_failed_payload_builds 0
reth_transaction_pool_reinserted_transactions 0
reth_network_occurrences_hash_already_seen_by_peer 1600
reth_network_disconnect_requested 0
reth_network_fetched_transactions 2380
reth_database_operation_calls_total{table="BlockWithdrawals",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="StoragesHistory",operation="cursor-insert"} 0
reth_database_operation_calls_total{table="BlockBodyIndices",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="HeaderNumbers",operation="delete"} 0
reth_database_operation_calls_total{table="StoragesHistory",operation="cursor-upsert"} 0
reth_database_operation_calls_total{table="AccountsHistory",operation="cursor-insert"} 0
reth_database_operation_calls_total{table="ChainState",operation="put"} 0
reth_database_operation_calls_total{table="TransactionBlocks",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="AccountsHistory",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="StoragesHistory",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="Receipts",operation="cursor-append"} 0
reth_database_operation_calls_total{table="TransactionHashNumbers",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="StorageChangeSets",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="Bytecodes",operation="get"} 1290
reth_database_operation_calls_total{table="AccountChangeSets",operation="cursor-delete-current"} 0
reth_database_operation_calls_total{table="AccountsHistory",operation="cursor-upsert"} 0
reth_blockchain_tree_trie_updates_insert_recomputed 0
reth_network_eth_headers_requests_received_total 0
reth_blockchain_tree_reorgs 0
reth_io_write_bytes 201523200
reth_network_useless_peer 0
reth_io_syscw 5140
reth_io_rchar 1575130
reth_transaction_pool_drift_count 0
reth_network_eth_bodies_requests_received_total 0
reth_network_messages_with_hashes_already_seen_by_peer 1320
reth_blockchain_tree_longest_sidechain_height 0
reth_blockchain_tree_latest_reorg_depth 0
reth_db_table_pages{table="Receipts",type="leaf"} 0
reth_db_table_pages{table="StoragesTrie",type="branch"} 59620
reth_db_table_pages{table="Receipts",type="branch"} 0
reth_db_table_size{table="BlockWithdrawals"} 1824911360
reth_db_table_size{table="Receipts"} 0
reth_network_pending_outgoing_connections 0
reth_network_pending_pool_imports 0
reth_db_table_entries{table="Receipts"} 0
reth_blockchain_tree_canonical_chain_height 0
reth_jemalloc_resident 167813120
reth_jemalloc_metadata 23089120
reth_process_virtual_memory_bytes 4485205401600
reth_payloads_active_jobs 0
reth_trie_leaves_added{type="storage",quantile="0.5"} 0
reth_trie_leaves_added_count{type="storage"} 2620
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0.5"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0.9"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0.95"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0.99"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="0.999"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds{quantile="1"} 0
reth_consensus_engine_persistence_prune_before_duration_seconds_sum 0
reth_consensus_engine_persistence_prune_before_duration_seconds_count 0
reth_storage_providers_database_insert_history_indices{quantile="0"} 0
reth_storage_providers_database_insert_history_indices{quantile="0.5"} 0
reth_storage_providers_database_insert_history_indices{quantile="0.9"} 0
reth_storage_providers_database_insert_history_indices{quantile="0.95"} 0
reth_storage_providers_database_insert_history_indices{quantile="0.99"} 0
reth_storage_providers_database_insert_history_indices{quantile="0.999"} 0
reth_storage_providers_database_insert_history_indices{quantile="1"} 0
reth_storage_providers_database_insert_history_indices_sum 0
reth_storage_providers_database_insert_history_indices_count 0
reth_trie_branches_added_sum{type="state"} 95330
reth_trie_branches_added_count{type="storage"} 2620
reth_storage_providers_database_insert_block{quantile="0"} 0
reth_storage_providers_database_insert_block{quantile="0.5"} 0
reth_storage_providers_database_insert_block{quantile="0.9"} 0
reth_storage_providers_database_insert_block{quantile="0.95"} 0
reth_storage_providers_database_insert_block{quantile="0.99"} 0
reth_storage_providers_database_insert_block{quantile="0.999"} 0
reth_storage_providers_database_insert_block{quantile="1"} 0
reth_storage_providers_database_insert_block_sum 0
reth_storage_providers_database_insert_block_count 0
reth_storage_providers_database_insert_state{quantile="0"} 0
reth_storage_providers_database_insert_state{quantile="0.5"} 0
reth_storage_providers_database_insert_state{quantile="0.9"} 0
reth_storage_providers_database_insert_state{quantile="0.95"} 0
reth_storage_providers_database_insert_state{quantile="0.99"} 0
reth_storage_providers_database_insert_state{quantile="0.999"} 0
reth_storage_providers_database_insert_state{quantile="1"} 0
reth_storage_providers_database_insert_state_sum 0
reth_storage_providers_database_insert_state_count 0
reth_storage_providers_database_update_pipeline_stages{quantile="0"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="0.5"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="0.9"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="0.95"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="0.99"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="0.999"} 0
reth_storage_providers_database_update_pipeline_stages{quantile="1"} 0
reth_storage_providers_database_update_pipeline_stages_sum 0
reth_storage_providers_database_update_pipeline_stages_count 0
reth_storage_providers_database_insert_hashes{quantile="0"} 0
reth_storage_providers_database_insert_hashes{quantile="0.5"} 0
reth_storage_providers_database_insert_hashes{quantile="0.9"} 0
reth_storage_providers_database_insert_hashes{quantile="0.95"} 0
reth_storage_providers_database_insert_hashes{quantile="0.99"} 0
reth_storage_providers_database_insert_hashes{quantile="0.999"} 0
reth_storage_providers_database_insert_hashes{quantile="1"} 0
reth_storage_providers_database_insert_hashes_sum 0
reth_storage_providers_database_insert_hashes_count 0
reth_database_transaction_commit_gc_cputime_duration_seconds_sum{mode="read-write",outcome="commit"} 0
reth_trie_duration_seconds_count{type="storage"} 2620
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0.5"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0.9"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0.95"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0.99"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="0.999"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds{quantile="1"} 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds_sum 0
reth_consensus_engine_persistence_remove_blocks_above_duration_seconds_count 0
reth_network_transaction_fetcher_legacy{quantile="1"} 360
```

For example, in engine1.txt, all these metrics are filled in:

```
$ grep -E 'reth_blockchain_tree_trie_updates_insert_cached|reth_payloads_failed_payload_builds|reth_transaction_pool_reinserted_transactions|reth_network_occurrences_hash_already_seen_by_peer|reth_network_disconnect_requested' engine1.txt | grep -v '#'
reth_blockchain_tree_trie_updates_insert_cached 60
reth_network_occurrences_hash_already_seen_by_peer 14581
reth_network_disconnect_requested 9
reth_payloads_failed_payload_builds 19
reth_transaction_pool_reinserted_transactions 1
```
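
The grep pipeline matches any line whose last character is 0, so values such as 1600 or 2380 are listed alongside the genuinely zeroed metrics. A stricter comparison could parse the sample values and report only series that are nonzero without the flag but exactly 0 with it. The following is a minimal sketch, assuming both dumps use the Prometheus text exposition format with one sample per line and no trailing timestamps; the helper names `parse_metrics` and `zeroed_metrics` are hypothetical and not part of reth.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into a {series: value} dict.

    Assumption: one sample per line, no trailing timestamp.
    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field; everything before it
        # (metric name plus optional {labels}) identifies the series.
        series, _, value = line.rpartition(" ")
        try:
            samples[series.strip()] = float(value)
        except ValueError:
            continue  # skip lines that do not end in a number
    return samples


def zeroed_metrics(baseline_text, experimental_text):
    """Return series that are nonzero in the baseline dump but 0 in the other."""
    baseline = parse_metrics(baseline_text)
    experimental = parse_metrics(experimental_text)
    return sorted(
        name
        for name, value in experimental.items()
        if value == 0 and baseline.get(name, 0) != 0
    )
```

To apply it to the two dumps from this report, read `engine1.txt` and `engine2.txt` and pass their contents as `zeroed_metrics(engine1_text, engine2_text)`. Series that are 0 in both dumps are left out, which removes most of the noise in the list above.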

Steps to reproduce

  1. Install reth
  2. Add --engine.experimental to the command line flags
  3. Start reth
  4. Run curl -s http://localhost:9001/metrics | grep reth_blockchain_tree_canonical_chain_height (or grep for one of the other missing metrics) and see that it reports 0
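
The check in step 4 can also be scripted. This is a minimal sketch, assuming the metrics endpoint is the `http://localhost:9001/metrics` address used in this report and that the metric of interest has no labels; `fetch_metrics` and `metric_value` are hypothetical helper names, not part of reth.

```python
from urllib.request import urlopen


def fetch_metrics(url="http://localhost:9001/metrics", timeout=5):
    """Download the metrics page as text (URL taken from this report)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_value(text, name):
    """Return the value of an unlabeled metric, or None if it is absent."""
    for line in text.splitlines():
        fields = line.split()
        # An unlabeled sample has exactly two fields: name and value.
        if len(fields) == 2 and fields[0] == name:
            return float(fields[1])
    return None
```

On an affected node, `metric_value(fetch_metrics(), "reth_blockchain_tree_canonical_chain_height")` would be expected to return 0 while the logs keep reporting new canonical blocks.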

Node logs

reth.log

Platform(s)

Linux (x86)

What version/commit are you on?

c228fe15808c3acbf18dc3af1a03ef5cbdcda07a

What database version are you on?

Current database version: 2
Local database version: 2

Which chain / network are you on?

80084 (berachain bArtio v2)

What type of node are you running?

Full via --full flag

What prune config do you use, if any?

[prune]
block_interval = 5

[prune.segments]
sender_recovery = "full"

[prune.segments.account_history]
distance = 10064

[prune.segments.storage_history]
distance = 10064

[prune.segments.receipts_log_filter]

If you've built Reth from source, provide the full command you used

cargo build --locked --profile release --bin reth


emhane commented 2 weeks ago

@nkysg maybe this would be something for you to try out if you're not yet familiar with observing the node through the metrics dashboard? running the dashboard through docker is by far the easiest https://github.com/paradigmxyz/reth/blob/main/book/installation/docker.md#run-only-grafana-in-docker

emhane commented 2 weeks ago

this public node is useful for comparing performance of local node https://reth.paradigm.xyz/d/2k8BXz24x/reth?orgId=1&refresh=30s

cheddiefender commented 2 weeks ago
[Image: Screenshot 2024-08-31 at 11 54 08 AM]

Not sure if this is related, but I'm not getting updated stage checkpoints or sync progress with the --engine.experimental flag. It works fine without it. The node is syncing correctly, but the dashboard doesn't receive the values.

yutianwu commented 1 week ago

It does have an issue with the metrics for the new engine as I mentioned in https://github.com/paradigmxyz/reth/issues/10521.

nkysg commented 1 week ago

> @nkysg maybe this would be something for you to try out if you're not yet familiar with observing the node through the metrics dashboard? running the dashboard through docker is by far the easiest https://github.com/paradigmxyz/reth/blob/main/book/installation/docker.md#run-only-grafana-in-docker

Thanks for the guidance. If I have capacity, I will try.

emhane commented 1 week ago

duplicate of https://github.com/paradigmxyz/reth/issues/10521