in #34829 and #34972 we implemented VoteState::deserialize_into(), a custom parser intended to be suitable for usage in a bpf context. we now want to use the new parser everywhere because it is 7-20x faster than bincode depending on the input
HANA custom vs bincode
* we win: 10000, they win: 0
* our max 22.422µs, their max 471.89µs
* our avg 5.76µs, their avg 35.956µs
* avg speedup 30.195µs
the advantage remains even if we compare our VoteState parser to bincode parsing to VoteStateVersions without calling convert_to_current()
HANA custom vs bincode, wins: 10000 vs 0, max: 22.147µs vs 96.273µs, avg: 4.16µs vs 38.144µs
it also uses half as much memory for V1_14_11 because conversion to Current happens during parsing
Proposed Solution
change the impl of VoteState::deserialize() from bincode to our new parser, and use the new VoteState::deserialize() wherever we can
based on my research into the (various) different mechanisms to deserialize VoteState, this is the basic situation:
stake_state::create_account(). this requires no feature gate. its used twice in real code. but in both cases, its safe, because the vote account is also created by us
VoteAccount::vote_state() in vote/src/vote_account.rs. there isnt a path to feature gating here. this is used extensively in the most sensitive parts of the code, including consensus and blockstore, which are feature-unaware
Bank::_load_vote_and_stake_accounts in runtime/src/bank.rs. feature gate is trivial because we are inside Bank. there is no need for it tho because only success is checked
vote_state::from() in programs/vote/src/vote_state/mod.rs. the only important seeming non-test use (aside from stake_state::create_account()) is active_vote_account_exists_in_bank in core/src/validator.rs. feature gating here is trivial because it has the bank, but as above this only checks success. check_vote_account in validator/src/bootstrap.rs also uses it and doesnt have bank. but the "bootstrap" designation gives me the impression its fine because its part of setup code
numerous uses of get_state::<VoteStateVersions>() in programs/vote/src/vote_state/mod.rs. these are trivial to feature gate because they all have access to the feature set. most uses have no performance benefit but verify_and_get_vote_state is part of vote processing
there are a couple uses in the stake program we can change but im inclined to leave them as-is so we dont have a feature gate blocking stake program bpf-ization. in any event there is no performance benefit to be gained here
originally i believed that we needed a feature gate for the vote program changes, because of a couple patterns where VoteStateVersions was decoded and then only later converted to Current. but it appears now that nothing important happens in between. so a feature gate comes down to "we expect the parser to behave identically, and its already in tower and blockstore, so should go all the way" vs "minimize the ungated change surface for the sake of prudence." hoping for input here
the new parser always produces an identical parse to bincode for well-formed inputs. the potential risk is that in some cases it behaves differently for certain malformed inputs that bincode would accept, particularly:
if is_empty is set, the new parser will skip deserializing prior_voters whereas bincode will deserialize the (invalid) data
if prior_voters is spare, ie there are (Pubkey::default(), 0, 0) values followed by real values, the new parser will error, whereas bincode will deserialize the (invalid) data. i am probably going to remove this tho because it adds little and these should not be capable of existing anyway removed this
these differences should not affect anything because presumably maliciously crafted accounts not owned by the vote program should be rejected by an owner check, and inserting a prior voter with a zero epoch doesnt appear to be possible because the code is well-encapsulated. but given how extensive this change is, im trying to be maximally cautious
update: the only difference is now the is_empty check. at this point im leaning toward no feature gate
as a side note, the parser falls back to bincode for V0_23_5, but ive confirmed there are no such accounts (at least delivered by the getVoteAccounts rpc method) on devnet, testnet, or mainnet-beta
there are some complications revolving around is_uninitialized() in VoteStateVersions because an uninitialized V0_23_5 account converted to current becomes "initialized", because is_uninitialized() (presumably erroneously) only checks if voters is empty, convert_to_current() put the default pubkey/zero epoch "voter" in the btreemap, and changing convert_to_current() to not do this breaks other things
my current plan for testing the parts of the change that cannot be feature-gated is to cherrypick the new parser into a 1.17 branch and run a validator built off that codebase for several epochs to confirm correct behavior is adhered to
final update: ive decided to forgo the feature gate because a) the parser will already be in so many codepaths without it that it only really provides false security, and b) i made some changes to its behavior such that it should now succeed or fail exactly in line with bincode
Problem
in #34829 and #34972 we implemented
VoteState::deserialize_into()
, a custom parser intended to be suitable for usage in a bpf context. we now want to use the new parser everywhere because it is 7-20x faster than bincode depending on the inputthe advantage remains even if we compare our
VoteState
parser to bincode parsing toVoteStateVersions
without callingconvert_to_current()
it also uses half as much memory for
V1_14_11
because conversion toCurrent
happens during parsingProposed Solution
change the impl of
VoteState::deserialize()
from bincode to our new parser, and use the newVoteState::deserialize()
wherever we canbased on my research into the (various) different mechanisms to deserialize
VoteState
, this is the basic situation:stake_state::create_account()
. this requires no feature gate. its used twice in real code. but in both cases, its safe, because the vote account is also created by usVoteAccount::vote_state()
invote/src/vote_account.rs
. there isnt a path to feature gating here. this is used extensively in the most sensitive parts of the code, including consensus and blockstore, which are feature-unawareBank::_load_vote_and_stake_accounts
inruntime/src/bank.rs
. feature gate is trivial because we are inside Bank. there is no need for it tho because only success is checkedvote_state::from()
inprograms/vote/src/vote_state/mod.rs
. the only important seeming non-test use (aside fromstake_state::create_account()
) isactive_vote_account_exists_in_bank
incore/src/validator.rs
. feature gating here is trivial because it has the bank, but as above this only checks success.check_vote_account
invalidator/src/bootstrap.rs
also uses it and doesnt have bank. but the "bootstrap" designation gives me the impression its fine because its part of setup codeget_state::<VoteStateVersions>()
inprograms/vote/src/vote_state/mod.rs
. these are trivial to feature gate because they all have access to the feature set. most uses have no performance benefit butverify_and_get_vote_state
is part of vote processingoriginally i believed that we needed a feature gate for the vote program changes, because of a couple patterns where
VoteStateVersions
was decoded and then only later converted toCurrent
. but it appears now that nothing important happens in between. so a feature gate comes down to "we expect the parser to behave identically, and its already in tower and blockstore, so should go all the way" vs "minimize the ungated change surface for the sake of prudence." hoping for input herethe new parser always produces an identical parse to bincode for well-formed inputs. the potential risk is that in some cases it behaves differently for certain malformed inputs that bincode would accept, particularly:
is_empty
is set, the new parser will skip deserializingprior_voters
whereas bincode will deserialize the (invalid) dataifremoved thisprior_voters
is spare, ie there are(Pubkey::default(), 0, 0)
values followed by real values, the new parser will error, whereas bincode will deserialize the (invalid) data. i am probably going to remove this tho because it adds little and these should not be capable of existing anywaythese differences should not affect anything because presumably maliciously crafted accounts not owned by the vote program should be rejected by an owner check, and inserting a prior voter with a zero epoch doesnt appear to be possible because the code is well-encapsulated. but given how extensive this change is, im trying to be maximally cautiousupdate: the only difference is now the
is_empty
check. at this point im leaning toward no feature gateas a side note, the parser falls back to bincode for
V0_23_5
, but ive confirmed there are no such accounts (at least delivered by thegetVoteAccounts
rpc method) on devnet, testnet, or mainnet-betathere are some complications revolving around
is_uninitialized()
inVoteStateVersions
because an uninitializedV0_23_5
account converted to current becomes "initialized", becauseis_uninitialized()
(presumably erroneously) only checks if voters is empty,convert_to_current()
put the default pubkey/zero epoch "voter" in the btreemap, and changingconvert_to_current()
to not do this breaks other thingsmy current plan for testing the parts of the change that cannot be feature-gated is to cherrypick the new parser into a 1.17 branch and run a validator built off that codebase for several epochs to confirm correct behavior is adhered to
final update: ive decided to forgo the feature gate because a) the parser will already be in so many codepaths without it that it only really provides false security, and b) i made some changes to its behavior such that it should now succeed or fail exactly in line with bincode