jito-foundation / stakenet

Jito StakeNet
https://www.jito.network/stakenet/
Apache License 2.0

Feature: Add slot skip rates to ValidatorHistoryAccount #5

Open: buffalu opened this issue 7 months ago

buffalu commented 7 months ago

Is your feature request related to a problem? Please describe.
The ValidatorHistoryAccount should track the block skip rate of every validator. It should be updated in semi-realtime.

Describe the solution you'd like
There are a few pieces of information that need to be stored on-chain to do this.

  1. The leader schedule, which can be uploaded by the stake_authority, a permissioned authority that has the ability to update certain parameters that aren't otherwise available on-chain. If it fits in a single account, it may make sense to store it there. If not, it could potentially be stored in a PDA keyed by the validator's vote/identity account and the epoch, with some lifetime associated with it to save on rent and state bloat. Need to keep in mind the account access patterns.
  2. The SlotHistory sysvar can be used to check whether a slot was skipped. The API requires a slot, so the slot -> pubkey mapping is needed (see the sketch after this list).
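
A minimal sketch of that check, assuming the SlotHistory sysvar has already been deserialized from its account data and that the slot comes from the uploaded leader schedule; the function name is illustrative, not part of the program:

use solana_program::slot_history::{Check, SlotHistory};

/// True if `slot` is inside the sysvar's window and no block was produced
/// for it, i.e. the leader skipped that slot.
fn slot_was_skipped(slot_history: &SlotHistory, slot: u64) -> bool {
    matches!(slot_history.check(slot), Check::NotFound)
}

Check::Future and Check::TooOld mean the slot is outside the sysvar's window, so those cases would need separate handling when computing a rate.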

This will be a tricky one, looking forward to seeing the implementation :)

anoushk1234 commented 7 months ago

Hey @buffalu I'd like to take this up, can you please assign me?

anoushk1234 commented 6 months ago

@buffalu the ValidatorHistoryEntry has a size limit of 128, and looking at the leader schedule for most validators, the space required to store it would be around 6-8 KB, which is a lot. A potential solution is to use a leader schedule account and store it in that.

Why do we need a slot-to-pubkey mapping? If we know the leader schedule and we have the SlotHistory sysvar, we can check whether any slot in the schedule is missing from the sysvar.

buffalu commented 6 months ago

> A potential solution is to use a leader schedule account and store it in that.

This sounds reasonable to me. How are you thinking about the layout for that account? Is it HashMap<Pubkey, Vec> or something else?

anoushk1234 commented 6 months ago

> > A potential solution is to use a leader schedule account and store it in that.
>
> This sounds reasonable to me. How are you thinking about the layout for that account? Is it HashMap<Pubkey, Vec> or something else?

I was thinking more along the lines of having an account per epoch per validator.

For epoch N you create a leader schedule account whose key is mapped to the ValidatorHistoryEntry and contains the epoch number as well.

#[account(zero_copy)]
pub struct LeaderSchedule {
    pub epoch: u64,
    pub vote_identity: Pubkey,
    pub leader_schedule: [u64; 20000], // currently the highest-stake validator has a ~14k-slot schedule, so this gives a good margin
}

pub struct ValidatorHistoryEntry {
    ...
    pub leader_schedule: Pubkey,
}

Per epoch cost would be ~ 0.0032 SOL.

ebatsell commented 6 months ago

@anoushk1234

Each validator with a ValidatorHistory account is given an index, which we can use to save a lot of space when storing the leader schedule, and it lets us use one global account rather than one per validator. It also removes the need to store an extra 32-byte pubkey on the validator history account.

We can invert the ValidatorHistory -> LeaderSchedule mapping to be LeaderSchedule -> ValidatorHistory like this:

#[account(zero_copy)]
pub struct LeaderSchedule {
  pub epoch: u64,
  pub slot_offset: u64,
  // Each value in this array is the index of the leader for the slot at i + slot_offset
  pub leader_schedule: [u32; 432000],
}

Size: 432000 * 4 bytes + 8 + 8 = 1728016 bytes, which is under the 10 MiB account size limit.

Then, in an instruction for each validator, you can loop through SlotHistory and check all slots where leader_schedule[i] == validator_history.index to compute the previous epoch's skip rate (see the sketch below). Check the ClusterHistory struct I added a few PRs ago for how this is accessed.
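
A sketch of that loop, assuming the global LeaderSchedule layout above and an already-deserialized SlotHistory sysvar; the field and variable names follow the structs proposed in this thread rather than existing program code:

use solana_program::slot_history::{Check, SlotHistory};

/// Returns skipped / total leader slots for one validator, or None if the
/// validator had no leader slots covered by the SlotHistory window.
fn skip_rate(
    leaders: &[u32],       // LeaderSchedule.leader_schedule
    slot_offset: u64,      // LeaderSchedule.slot_offset
    validator_index: u32,  // the validator's ValidatorHistory index
    slot_history: &SlotHistory,
) -> Option<f64> {
    let mut total = 0u64;
    let mut skipped = 0u64;
    for (i, &leader) in leaders.iter().enumerate() {
        if leader != validator_index {
            continue;
        }
        let slot = slot_offset + i as u64;
        match slot_history.check(slot) {
            Check::Found => total += 1,                       // block was produced
            Check::NotFound => { total += 1; skipped += 1; }  // leader slot skipped
            _ => {}                                           // slot outside the SlotHistory window
        }
    }
    (total > 0).then(|| skipped as f64 / total as f64)
}

Depending on compute limits, the loop may need to be split across multiple instructions or restricted to a slot range rather than run over all 432,000 entries at once.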

You'd need to upload the LeaderSchedule across multiple TXs because it's so big, and also initialize it with multiple instructions - check ReallocValidatorHistoryAccount for an example. You could also maybe reuse the same LeaderSchedule account each epoch, but that could make the timing of updates each epoch more complex.
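
For reference, a hedged sketch of what one chunked-upload instruction could look like under that approach; the account, instruction, and field names come from the structs proposed in this thread rather than the existing program, and the stake_authority check and account initialization are omitted:

use anchor_lang::prelude::*;
use anchor_lang::solana_program::program_error::ProgramError;

#[account(zero_copy)]
pub struct LeaderSchedule {
    pub epoch: u64,
    pub slot_offset: u64,
    pub leader_schedule: [u32; 432_000],
}

#[derive(Accounts)]
pub struct CopyLeaderScheduleChunk<'info> {
    #[account(mut)]
    pub leader_schedule: AccountLoader<'info, LeaderSchedule>,
    // A real instruction would constrain this against the config's stake_authority.
    pub stake_authority: Signer<'info>,
}

// Inside the #[program] module: each call writes one chunk, so the keeper
// sends as many of these transactions as needed to fill the 432,000-slot
// array after the account has been created and realloc'd to full size.
pub fn copy_leader_schedule_chunk(
    ctx: Context<CopyLeaderScheduleChunk>,
    start_index: u32,
    chunk: Vec<u32>,
) -> Result<()> {
    let mut schedule = ctx.accounts.leader_schedule.load_mut()?;
    let start = start_index as usize;
    let end = start
        .checked_add(chunk.len())
        .ok_or(ProgramError::InvalidInstructionData)?;
    if end > schedule.leader_schedule.len() {
        return Err(ProgramError::InvalidInstructionData.into());
    }
    schedule.leader_schedule[start..end].copy_from_slice(&chunk);
    Ok(())
}

Since instruction data is capped by the ~1.2 KB transaction size, each chunk can only carry a few hundred u32 indices, so filling the whole array would take on the order of a couple thousand transactions per epoch.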