Open georgeee opened 6 days ago
It might be interesting to encode another metric too: we can view the pending snark work as a binary number, based on whether it is ready or not. For example, if the next snark work is the high bit, and we're encoding 16 positions, the first counts for 2^15 ~= 32k, the second for 2^14 ~= 16k, etc.
Since there's some redundancy with the count of sequential snark work available, we could also consider starting the encoding from the first slot after the sequential work. With this extra data, we'll finally have a true historical view into the scan state's progress.
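To make the proposed encoding concrete, here is a minimal sketch. It assumes the pending work is available as an ordered list of booleans (`True` meaning ready), with the next required work first. The function names and the input shape are hypothetical and are not taken from the daemon code. `encode_after_sequential` is only one possible reading of the "start after the sequential work" variant.

```python
from typing import Sequence


def encode_readiness(ready: Sequence[bool], positions: int = 16) -> int:
    """Pack the readiness of the next `positions` work slots into an integer.

    ready[0] is the next required snark work and maps to the high bit.
    Slots beyond the end of `ready` are treated as not ready (assumption).
    """
    value = 0
    for i in range(positions):
        bit = 1 if i < len(ready) and ready[i] else 0
        value = (value << 1) | bit
    return value


def sequential_ready(ready: Sequence[bool]) -> int:
    """Count the ready slots at the front of the queue, stopping at the first gap."""
    count = 0
    for is_ready in ready:
        if not is_ready:
            break
        count += 1
    return count


def encode_after_sequential(ready: Sequence[bool], positions: int = 16) -> int:
    """Variant: skip the leading run of ready work before encoding,
    since that run is already reported by the sequential count."""
    start = sequential_ready(ready)
    return encode_readiness(ready[start:], positions)
```

With 16 positions, a queue where only the next work is ready encodes to 2^15 = 32768, and one where only the second work is ready encodes to 2^14 = 16384, which matches the weights described above.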
Develop better dashboard charts to monitor the health of snark work production.
Context
Snark work production is a complicated process that has a great impact on the overall health of the network, most importantly on whether transactions from the transaction pool get included in blocks as early as possible.
Some recent behavior on mainnet suggests that snark work production might be suboptimal, leading to non-full blocks even while the transaction pool is stocked with valid transactions.
Metrics
Add a Grafana metric that reports how many transactions could be taken (even if above the 128 tx limit) before the block producer runs out of snark work. The metric is updated on every change of the snark pool, or at a regular interval if that is easier to implement.
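A rough sketch of how such a value could be computed follows. It assumes the required work is given as an ordered list of readiness flags and that each transaction consumes a fixed number of completed works. Both the input shape and the `works_per_tx` parameter are hypothetical simplifications, not the scan state's actual accounting.

```python
from typing import Sequence


def transactions_before_stall(ready: Sequence[bool], works_per_tx: int) -> int:
    """Estimate how many transactions fit before snark work runs out.

    `ready` lists required works in the order they are needed.
    `works_per_tx` is an assumed fixed cost per transaction (hypothetical).
    The result is deliberately not capped at the 128 tx block limit.
    """
    if works_per_tx <= 0:
        raise ValueError("works_per_tx must be positive")
    available = 0
    for is_ready in ready:
        if not is_ready:
            break  # the first missing work blocks everything after it
        available += 1
    return available // works_per_tx
```

The value would be recomputed and exported as a gauge whenever the snark pool changes, or on a timer if that is simpler.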
Check that there are metrics on:
Dashboard
Configure a dashboard with three plots showing data from different nodes: