Problem

The health of the validation ecosystem is very important, yet the chain can keep working while that health is deteriorating. Unless information about it is cheap and reliable to obtain, people will not be able to act in time to reallocate stake or recover from impending failure modes. The failure modes we care about are loss of liveness or consistency, possibly as a result of actions by a Byzantine adversary. Having validator identity metadata available on-chain would also be very useful.

Solution

Query Node

Add support to the QN for rich historical information associated with validators. When this information lives in the QN, it is easy to access for apps like Pioneer and the CLI, and also for alert-based services that different actors may want to run to detect bad scenarios with low latency. The information we are interested in is:

History of pretty much all events associated with validation, rewards, staking, nomination and, most importantly, slashing.
Representation of validators, past and present, with associations to events, and also some useful cumulative numbers, e.g. for earnings and the number of events of various types.
Representation of nominators, with the same associations and cumulative numbers.
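
A rough sketch of how such records and their cumulative numbers could fit together. All type and field names here (`ValidatorEvent`, `ValidatorSummary`, `summarize`) are hypothetical and are not taken from the actual QN schema. Amounts are plain numbers for brevity; real code would likely need a big-integer type.

```typescript
// Hypothetical event kinds; the real QN schema may use different names.
type EventKind = "reward" | "slash" | "bond" | "unbond" | "nominate" | "chill";

interface ValidatorEvent {
  kind: EventKind;
  stash: string;       // validator stash account
  era: number;         // era in which the event occurred
  blockHeight: number; // block in which the event was emitted
  amount: number;      // token amount, 0 when not applicable
}

interface ValidatorSummary {
  stash: string;
  totalEarned: number;
  totalSlashed: number;
  eventCounts: Record<EventKind, number>;
}

// Derive the cumulative numbers for one validator from the event history.
function summarize(stash: string, events: ValidatorEvent[]): ValidatorSummary {
  const eventCounts: Record<EventKind, number> = {
    reward: 0, slash: 0, bond: 0, unbond: 0, nominate: 0, chill: 0,
  };
  let totalEarned = 0;
  let totalSlashed = 0;
  for (const e of events) {
    if (e.stash !== stash) continue; // ignore events of other validators
    eventCounts[e.kind] += 1;
    if (e.kind === "reward") totalEarned += e.amount;
    if (e.kind === "slash") totalSlashed += e.amount;
  }
  return { stash, totalEarned, totalSlashed, eventCounts };
}
```

In practice the QN would probably maintain these totals incrementally as events are processed, rather than recomputing them from the full history on every query.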

Question: here we really need to think about what nominators want to know when picking validators

CLI

Add support to the CLI for listing key information about the state of the chain and validation. This information should be either printable to screen or dumpable to an output file in a structured format that is easy to parse programmatically, such as JSON.
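
A minimal sketch of the two output modes, assuming a hypothetical `BlockInfo` shape and `render` helper (neither exists in the current CLI). The same data is either formatted for the screen or serialized as JSON for other tools to consume.

```typescript
type OutputFormat = "text" | "json";

// Hypothetical subset of the block information discussed in this issue.
interface BlockInfo {
  height: number;
  hash: string;
  timestamp: number; // milliseconds since the Unix epoch
  txCount: number;
  finalized: boolean;
}

// Produce either a human-readable listing or machine-readable JSON.
function render(info: BlockInfo, format: OutputFormat): string {
  if (format === "json") {
    return JSON.stringify(info, null, 2);
  }
  const lines = [
    "height:    " + info.height,
    "hash:      " + info.hash,
    "time:      " + new Date(info.timestamp).toISOString(),
    "tx count:  " + info.txCount,
    "finalized: " + (info.finalized ? "yes" : "no"),
  ];
  return lines.join("\n");
}
```

Keeping rendering separate from data collection means the command can write the returned string to the screen or to a file without changing how the data is gathered.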

The information that immediately comes to mind is:

Chain

Most recent finalised block by block height, including key information like the congestion variable, time, size, hash, height, tx count, etc.
Most recent unfinalized block, same information.
Presence of chain splits, i.e. finalisation of blocks on either side of a fork.

Validation

Number of validators in the validation set, with key information associated with each, like amount of stake backing, number of nominators, commission, when they last checked in, unpaid reward, impending slashes, and whatever else would be useful.
Current overall rate of staking.
Impending slashes.

Most of this information should be available in the state, but some extra historical information, e.g. metadata, may need to be added.

Feedback

I am in particular looking for feedback on

nailing down more specifically what information we need to collect or display that is important for the relevant user, in particular nominators and the council trying to ensure the overall health and vitality of the system.
whether there is an alternative approach that would be better in terms of flexibility in how information can be used by other apps or services
how to detect and report on splits

This looks very closely related to what I had in mind as well!

All the stats/data can be fetched without query node support, but for a few reasons that is not ideal. One is staking.historyDepth, which denotes the number of eras for which most of the relevant information can be fetched without having to query the state at some past block. Although 120 eras is quite a lot (about 30 days, assuming "perfect" block production), it's still pretty valuable to keep the information in the QN for faster and easier access.
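
The 30-day figure is simple arithmetic. A small sketch, assuming 3,600 blocks per era and a 6-second target block time; both values are assumptions chosen to match the "120 eras is about 30 days" figure, not values read from the runtime.

```typescript
const SECONDS_PER_DAY = 86400;

// Length of the window covered by staking.historyDepth, in days,
// assuming every block is produced exactly on schedule.
function historyWindowDays(
  historyDepth: number,    // number of eras kept in state
  blocksPerEra: number,    // assumed era length in blocks
  blockTimeSeconds: number // assumed target block time
): number {
  return (historyDepth * blocksPerEra * blockTimeSeconds) / SECONDS_PER_DAY;
}
```

With these assumed values, `historyWindowDays(120, 3600, 6)` evaluates to 30. Missed slots stretch each era in wall-clock time, so the window only gets longer when block production is imperfect.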

Off the top of my head, here is what the QN should store:

block height and timestamp of the first block for each era (the timestamp is stored for eras within the historyDepth)
- arguably also the block height and timestamp of the last block for easier lookup, although it's redundant...
the sessions of said era
results, such as:
- total era points
- total stake
- total reward
- duration of era (in blocks, sessions and time [s])
the validator set for each era, including
- configurations:
- stash, controller, stake and destination
- sessionKeys
- nomination configurations (allowing/not, commission)
- results:
- era points
- whether rewards are claimed (arguably, when and how much)
- offences and slashes
- nominators of validator
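
To make the per-era part of this list concrete, here is a hedged sketch of an era record and of how the duration fields could be derived from it. The names (`EraRecord`, `BlockRef`, `eraDuration`) are hypothetical, and amounts are plain numbers for brevity.

```typescript
interface BlockRef {
  height: number;
  timestamp: number; // milliseconds since the Unix epoch
}

// Hypothetical per-era record following the list above.
interface EraRecord {
  index: number;
  firstBlock: BlockRef;
  lastBlock: BlockRef; // redundant with the next era's first block, kept for easy lookup
  sessions: number[];  // session indices belonging to this era
  totalPoints: number;
  totalStake: number;
  totalReward: number;
}

interface EraDuration {
  blocks: number;
  sessions: number;
  seconds: number;
}

// Duration of an era in blocks, sessions and time.
function eraDuration(era: EraRecord): EraDuration {
  return {
    // inclusive count of blocks from the first to the last block
    blocks: era.lastBlock.height - era.firstBlock.height + 1,
    sessions: era.sessions.length,
    // time elapsed between the first and the last block timestamps
    seconds: (era.lastBlock.timestamp - era.firstBlock.timestamp) / 1000,
  };
}
```

Storing the last block explicitly lets all three duration values be computed from a single record, without joining against the next era.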

In addition, keep track of all bonded stash accounts, and keep a record of their activity.

blocks where the stash did certain actions, such as:
- bond
- bondExtra
- unbond (and when it can be withdrawn)
- withdrawUnbonded
- rebond
- payoutStakers
- maybe also validate, chill, nominate
- maybe also some other configurations, like setController, setPayee, session.setKeys
for each era where the stash was bonded:
- was validator/nominator/waiting list/bonded
- if validator:
- era points earned and expected era points (e.g. eraRewardPoints.total / eraRewardPoints.individual.length)
- nominators
- commission
- offences
- if nominator:
- stashes nominated (and stake distribution)
- rewards claimed for/by each
- in either case:
- controller
- stake
- claimed rewards
- destination
- slashes
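
The "expected era points" figure above is just the era average, which gives a quick way to see whether a validator is over- or under-performing. A sketch using a plain object in place of the on-chain `eraRewardPoints` value; the object shape and the helper names are assumptions for illustration.

```typescript
// Simplified stand-in for the on-chain era reward points.
interface EraRewardPoints {
  total: number;
  individual: Record<string, number>; // stash -> points earned in the era
}

// Average points per validator that earned points in the era.
function expectedEraPoints(points: EraRewardPoints): number {
  const validators = Object.keys(points.individual).length;
  return validators === 0 ? 0 : points.total / validators;
}

// Ratio of a validator's points to the era average;
// undefined if the stash earned no points entry in that era.
function relativePerformance(points: EraRewardPoints, stash: string): number | undefined {
  if (!Object.prototype.hasOwnProperty.call(points.individual, stash)) {
    return undefined;
  }
  const expected = expectedEraPoints(points);
  return expected === 0 ? undefined : points.individual[stash] / expected;
}
```

A ratio well below 1 over several eras would be a natural trigger for the alert-based services mentioned in the proposal.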

I'm sure there are a few other things that could be added, and that some of it could be scrapped, but this would be pretty complete...

On the CLI side:

Most recent finalised block by block height, including key information like the congestion variable, time, size, hash, height, tx count, etc.

Most recent unfinalized block, same information.

I'd say this is a single command like chainData:blockInfo, which takes one of the below input(s):

block, by number or hash
blockRange, from-to as number or hash
bestNumber, a boolean; if no input is given, the latest finalized block is returned
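
A sketch of how these inputs could be normalized into a single query value. The flag names follow the list above, but the types and functions (`BlockQuery`, `parseBlockId`, `parseQuery`) are hypothetical. The handling of `bestNumber` is one interpretation: when set, the best (possibly unfinalized) block is selected, otherwise the latest finalized one.

```typescript
type BlockId =
  | { kind: "number"; value: number }
  | { kind: "hash"; value: string };

type BlockQuery =
  | { type: "block"; id: BlockId }
  | { type: "blockRange"; from: BlockId; to: BlockId }
  | { type: "latest"; finalized: boolean };

// Hypothetical flags of the proposed chainData:blockInfo command.
interface BlockInfoFlags {
  block?: string;
  blockRange?: string; // "from-to", each side a number or a hash
  bestNumber?: boolean;
}

// Accept a decimal block number or a 0x-prefixed 32-byte hex hash.
function parseBlockId(raw: string): BlockId {
  if (/^0x[0-9a-fA-F]{64}$/.test(raw)) return { kind: "hash", value: raw };
  if (/^\d+$/.test(raw)) return { kind: "number", value: parseInt(raw, 10) };
  throw new Error("Invalid block identifier: " + raw);
}

function parseQuery(flags: BlockInfoFlags): BlockQuery {
  if (flags.block !== undefined) {
    return { type: "block", id: parseBlockId(flags.block) };
  }
  if (flags.blockRange !== undefined) {
    const parts = flags.blockRange.split("-");
    if (parts.length !== 2) throw new Error("Expected a range of the form from-to");
    return {
      type: "blockRange",
      from: parseBlockId(parts[0] || ""),
      to: parseBlockId(parts[1] || ""),
    };
  }
  // No block given: fall back to the chain head.
  return { type: "latest", finalized: !flags.bestNumber };
}
```

Resolving the flags into one tagged value up front keeps the rest of the command free of flag-combination checks.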

Presence of chain splits, i.e. finalisation of blocks on either side of a fork.

To be complete, this might require reading from the Polkadot telemetry server, but I think you can get it from the nodes your endpoint is connected to.
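
One simple way to check for this, sketched under the assumption that the tool can collect (node, height, hash) observations of finalized blocks from several nodes, for example through the `chain_getFinalizedHead` and `chain_getBlockHash` RPC calls. The types and the `findFinalityConflicts` function are hypothetical. Two different hashes finalized at the same height indicate a split.

```typescript
// One observation: a node reports this hash as finalized at this height.
interface FinalizedHead {
  node: string;
  height: number;
  hash: string;
}

interface Conflict {
  height: number;
  hashes: string[]; // distinct finalized hashes seen at this height
}

// Report every height at which nodes disagree on the finalized block hash.
function findFinalityConflicts(heads: FinalizedHead[]): Conflict[] {
  const byHeight: Record<number, string[]> = {};
  for (const h of heads) {
    const seen = byHeight[h.height] || (byHeight[h.height] = []);
    if (seen.indexOf(h.hash) === -1) seen.push(h.hash);
  }
  const conflicts: Conflict[] = [];
  for (const key of Object.keys(byHeight)) {
    const height = Number(key);
    const hashes = byHeight[height] || [];
    if (hashes.length > 1) conflicts.push({ height, hashes });
  }
  return conflicts;
}
```

This only compares nodes at heights they have both reported, so the caller would need to query all nodes for the hash at a common finalized height. It also only sees the nodes it can reach, which is why a telemetry source may still be needed for a complete picture.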

Validation

Number of validators in the validation set, with key information associated with each, like amount of stake backing, number of nominators, commission, when they last checked in, unpaid reward, impending slashes, and whatever else would be useful.

Except for the bolded one, which I either don't understand or is impossible, these are the important ones. This can be implemented already, but will be slow without QN support (especially for eras older than 1 month).

Current overall rate of staking.

...and expected rewards based on this

Impending slashes.

Yeah, offences and slashing spans as well.

Joystream / joystream

Draft: Validation Diagnostics #4510
