input-output-hk / mithril

Stake-based threshold multi-signatures protocol
https://mithril.network
Apache License 2.0

Document Prometheus metrics and Grafana Dashboard for signer #1834

jpraynaud closed this issue 2 months ago

jpraynaud commented 2 months ago

Why

Some SPOs asked for an explanation of the Prometheus metrics and the Grafana dashboard for the Mithril signer node.

What

Add an explanation of the signer node metrics to the documentation.

How

dlachaume commented 2 months ago

This is a proposal to expand the description of the Grafana dashboard template page:


We recommend configuring the dashboard to retrieve data over a meaningful period (e.g., 24 hours or a week).

Below is a detailed table outlining the metrics for better understanding and monitoring of the Mithril signer node:

| Metric | Description |
| --- | --- |
| Signer Registration Last Epoch | The epoch during which the last signer registration occurred |
| Signer Registration Success | Number of successful signer registrations |
| Signer Registration Errors | Number of errors that occurred during signer registrations |
| Signature Registration Last Epoch | The epoch during which the last signature registration occurred |
| Signature Registration Success | Number of successful signature registrations |
| Signature Registration Errors | Number of errors that occurred during signature registrations |
| State Machine Cycles | Total number of state machine cycles |
| State Machine Cycles Success | Number of state machine cycles that ended successfully |
| State Machine Cycles Errors | Number of state machine cycles that ended due to an error |

An error rate of up to 1% on the State Machine Cycles is not a cause for concern.
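To make the 1% guideline concrete, here is a minimal Python sketch of how the error rate could be computed from two readings of the State Machine Cycles counters taken at the start and end of an observation window (for example, 24 hours). The `CycleCounters` container and both function names are hypothetical and are not part of Mithril; the sketch also assumes the counters only ever increase between the two readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CycleCounters:
    """One reading of the cycle counters (hypothetical container)."""
    total: int   # "State Machine Cycles"
    errors: int  # "State Machine Cycles Errors"


def cycle_error_rate(start: CycleCounters, end: CycleCounters) -> float:
    """Fraction of cycles that ended in error between two readings."""
    cycles = end.total - start.total
    errors = end.errors - start.errors
    if cycles < 0 or errors < 0:
        # Assumption: a decrease means the counters were reset in between.
        raise ValueError("counters decreased between readings")
    if cycles == 0:
        return 0.0  # no cycles in the window, so nothing failed
    return errors / cycles


def is_concerning(rate: float, threshold: float = 0.01) -> bool:
    """True only when the rate strictly exceeds the 1% guideline."""
    return rate > threshold


# Usage with made-up numbers: 2000 cycles and 15 errors in the window.
start = CycleCounters(total=1000, errors=5)
end = CycleCounters(total=3000, errors=20)
rate = cycle_error_rate(start, end)
print(f"error rate: {rate:.2%}, concerning: {is_concerning(rate)}")
```

Using the difference between two readings rather than the raw totals keeps the rate tied to the chosen time window, so an old burst of errors does not skew the result indefinitely.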

Feel free to reach out to us on the Discord channel for questions and/or help.