Open Rjected opened 1 year ago
I'm down to help out with this.
Go for it Max - and yes it is insane how high quality these geth metrics are
hey actually a bit confused. DisconnectMetrics gets incremented both for incoming and outgoing dials. "dial metrics" seems to imply it is only relevant for outgoing dials
maybe instead we should keep one set of disconnect metrics on network/src/manager.rs for outgoing, one for incoming, and then add the remaining metrics for outgoing dials (rlpx, connection)? otherwise it's a bit logically inconsistent.
e.g. maybe
pub struct NetworkManager<C> {
    // .. (other fields elided)
    metrics: NetworkMetrics,
    incoming_disconnect_metrics: DisconnectMetrics,
    outgoing_disconnect_metrics: DisconnectMetrics,
    dial_metrics: DialMetrics,
}
@Rjected
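To make the proposed split concrete, here is a minimal, self-contained sketch. It assumes plain atomic counters as stand-ins for the real metric types; `Direction`, `Reason`, and `SessionDisconnectMetrics` are hypothetical names used only for illustration, and the `DisconnectMetrics` here is a simplified stand-in for the one in reth, not its actual definition.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Direction of the session that was disconnected (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Direction {
    Incoming,
    Outgoing,
}

/// Simplified, hypothetical disconnect reason; the real set is larger.
#[derive(Clone, Copy, Debug)]
pub enum Reason {
    TooManyPeers,
    Other,
}

/// Stand-in for `DisconnectMetrics`: one counter per modeled reason.
#[derive(Debug, Default)]
pub struct DisconnectMetrics {
    too_many_peers: AtomicU64,
    other: AtomicU64,
}

impl DisconnectMetrics {
    /// Increments the counter that matches the given reason.
    pub fn increment(&self, reason: Reason) {
        let counter = match reason {
            Reason::TooManyPeers => &self.too_many_peers,
            Reason::Other => &self.other,
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Sum of all disconnects recorded in this set.
    pub fn total(&self) -> u64 {
        self.too_many_peers.load(Ordering::Relaxed) + self.other.load(Ordering::Relaxed)
    }
}

/// One metrics set per direction, so incoming and outgoing
/// disconnects are never mixed into the same counters.
#[derive(Debug, Default)]
pub struct SessionDisconnectMetrics {
    incoming: DisconnectMetrics,
    outgoing: DisconnectMetrics,
}

impl SessionDisconnectMetrics {
    /// Routes the disconnect to the set for its direction.
    pub fn record(&self, direction: Direction, reason: Reason) {
        match direction {
            Direction::Incoming => self.incoming.increment(reason),
            Direction::Outgoing => self.outgoing.increment(reason),
        }
    }

    pub fn incoming_total(&self) -> u64 {
        self.incoming.total()
    }

    pub fn outgoing_total(&self) -> u64 {
        self.outgoing.total()
    }
}
```

With this shape, the call site only has to know the session direction when a disconnect happens, and the dial-only metrics (rlpx, connection) can live in a separate struct that is touched only on the outgoing path.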
That makes sense. BTW just unassigned, because having people assigned on larger issues like this seems to discourage others from working on them, so I'm taking a more FCFS approach to reviewing PRs
This issue is stale because it has been open for 21 days with no activity.
I'm working on adding eth handshake error metrics. Looks like the changes made in https://github.com/paradigmxyz/reth/pull/3729 were never added to the etc/grafana/dashboards/overview.json file. I can open a separate PR to add those metrics.
that is an RLPx metric! it needs to go in the tx pool dashboard not the discovery dashboard cc @Rjected
all issues here need to take into account https://github.com/paradigmxyz/reth/issues/8150
I am applying to this issue via OnlyDust platform.
Hello guys, I have extensive experience in DevOps, security, and monitoring. For instance, I am the lead developer of Tikuna (https://tikuna.io/), a security monitoring tool for Ethereum. I think I could help implement one or several of the metrics.
I could use the provided go-ethereum code as an example and provide similar implementations in Reth, plus the required Grafana dashboards.
Describe the feature
I was very impressed by https://github.com/ethereum/go-ethereum/pull/27621 and thought it would be a good idea for us to replicate some of these metrics. This tracks what we would need to add in reth, and what we would need to add in the existing grafana dashboard. Each of these tasks is likely small, so anyone who would like to take one should let me know, and I'll create an issue to track the individual task!
Reth - Peer Discovery dashboard
Discovery
We would need an additional metric for nodes in kbuckets, and a graph in grafana:
Reth - Transaction Pool dashboard
eth handshake
We would need to create metrics for eth handshake errors that map to these:
p2p dials
We already have some error metrics: https://github.com/paradigmxyz/reth/blob/526f624e1cbfd659184873bffe12ec602678b57f/crates/net/network/src/metrics.rs#L64-L110
So we should make sure that the error metrics are in the grafana dashboard and named appropriately, and add the metrics which we don't already have:
Out of these, I don't think we have a metric similar to p2p/dials/success.
Grafana
In addition to these, it would be nice to create a graph similar to the Dial Quality graph from the original PR:
Additional context
No response
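For the missing p2p/dials/success equivalent and the Dial Quality graph discussed in this issue, a minimal sketch could look like the following. It assumes the graph is essentially successful dials relative to all dial attempts, which is a guess at what the original PR plots; `DialMetrics`, `record`, and `success_ratio` are hypothetical names, and atomic counters stand in for real metric types. In a real dashboard the ratio would more likely be computed by the query from the two exported counters rather than in the node.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical dial counters: one for successes, one for failures.
#[derive(Debug, Default)]
pub struct DialMetrics {
    success: AtomicU64,
    failure: AtomicU64,
}

impl DialMetrics {
    /// Records the outcome of a single outgoing dial attempt.
    pub fn record(&self, succeeded: bool) {
        let counter = if succeeded { &self.success } else { &self.failure };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Fraction of dial attempts that succeeded, or `None`
    /// if no dials have been recorded yet.
    pub fn success_ratio(&self) -> Option<f64> {
        let ok = self.success.load(Ordering::Relaxed);
        let failed = self.failure.load(Ordering::Relaxed);
        let total = ok + failed;
        if total == 0 {
            None
        } else {
            Some(ok as f64 / total as f64)
        }
    }
}
```

Returning `None` for the empty case avoids a division by zero and lets the caller decide how to display "no data".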