paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/
1.77k stars 634 forks source link

Incorrect Block Author Reward During Era Transition in Polkadot and Other Chains #4923

Open ToufeeqP opened 2 months ago

ToufeeqP commented 2 months ago

Is there an existing issue?

Experiencing problems? Have you tried our Stack Exchange first?

Description of bug

During era transitions in PoS-based chains utilizing the BABE protocol for block production, the authorship of blocks is sometimes incorrectly identified. This leads to the block author reward being misallocated. Specifically, the reward may be given to the wrong authority or not given at all for certain era-transitioning blocks.

As a result of this bug, many of the era transition blocks have either rewarded the incorrect authority or were not rewarded at all. The following list of blocks is non-exhaustive:

Steps to reproduce

  1. Set up a PoS-based chain using Substrate with BABE as the block production authority with multiple validators.
  2. Modify the staking dynamics to change the vote weight of the existing validator set or increase the number of validators
  3. Allow the chain to progress to an era transition.
  4. Observe the authorship rewards during the blocks where the new era is enacted.
  5. Verify the author of a block by decoding the SEAL digest and check the author who was rewarded by the authorship pallet.
chungquantin commented 2 months ago

I think it would be clearer if you can attach a code snippet for the PoS chain runtime & node service configuration.

burdges commented 2 months ago

This is bizarre because you'd expect the chain tracks its own block author rewards on-chain.

ToufeeqP commented 2 months ago

I think it would be clearer if you can attach a code snippet for the PoS chain runtime & node service configuration.

This issue applies to both the Polkadot and Kusama chains as well. I have already provided some blocks on Polkadot as references where the block authoring reward was either not awarded to the author or was given to a different validator than the one who authored the block.

bkchr commented 1 month ago

Yeah the problem here is that when a session enacts a new validator set and the call to determine the block author comes later, we will already fetch the validator from the new set. The solution is that we buffer the old set in the current block.

eagr commented 2 weeks ago

To ensure I understand this correctly.

When allocating block rewards, the author is fetched using the authorship pallet. If the author id is not already available in storage, the pallet will try to determine the author from the block digest, the result of which is the index of the responsible validator. The index will then be used to find the author in the current validator set, which may or may not coincide with the previous session. This is why it sometimes fails to identify the correct validator to reward or at all. So the session pallet would also need to store the old validator set for rewardee lookup in this case.

Curious, why isn’t the author account id in the digest in the first place?

bkchr commented 2 weeks ago

Curious, why isn’t the author account id in the digest in the first place?

Because the account id is not known to the node side. The node side only knows the session key, which is different to the account.

eagr commented 5 days ago

Because the account id is not known to the node side. The node side only knows the session key, which is different to the account.

Ah, I see.


It seems to me the buffer of the previous validator set is only useful for determining the author of the first block of each session, which is kinda trivial (but a bug is a bug). Do you it's worth the overhead of extra storage? @bkchr @burdges

The best I could find about determining whether a block is the first in a session is to check if authorities change is signaled in the digest. But that's specific to consensus engines. Do you think it makes sense to add a generic interface for that, similar to trait FindAuthor? Is if block_number % blocks_per_session == 0 reliable for this?

burdges commented 5 days ago

It's not storage overhead that's problematic here. We already have two validator sets simultaniously because elections require doing so. Afaik approvals checks needs the old authorities sets anyways, likely a third validator sets, so maybe someone can tell you how to access it.

eagr commented 5 days ago
impl<T: Config, Inner: FindAuthor<u32>> FindAuthor<T::ValidatorId>
    for FindAccountFromAuthorIndex<T, Inner>
{
    fn find_author<'a, I>(digests: I) -> Option<T::ValidatorId>
    where
        I: 'a + IntoIterator<Item = (ConsensusEngineId, &'a [u8])>,
    {
        let i = Inner::find_author(digests)?;

        fn first_block_of_session(digests: I) -> bool {
            // guess this wouldn't work? as a slot can be without a block
            frame_system::Pallet::<T>::block_number() % SLOTS_PER_SESSION == 0

            // or
            // check if authorities change is signaled in digests
        }

        let validators = if first_block_of_session(digests) {
            OldValidators::<T>::get()
        } else {
            Validators::<T>::get()
        };
        validators.get(i as usize).cloned()
    }
}

Thinking about something like this: OldValidators will be populated with the value of Validators on each rotation. Idk anything about the multiple sets of validators, but if the Validators value is being used for author lookup, guess OldValidators should be fine too, no?

burdges commented 5 days ago

You missunderstood my comment:

We should store the different session keys for each validator together in one structure of course, otherwise wastes lookup space. An OldValidators must already exist in approval voting, so all other session keys should live there too.

An OldValidators must already exist for equivocations slashing too, so the exact keys your interested in here, well unless we produce a history proof off-chain, but likely no.

Just use the OldValidators that already exists.

Also, our storage lacks O(1) moves, and future storages would lack them too, so Validators and OldValidators should not be a fixed storage path, but their storage prefixs should presumably be constructed dynamically based upon the session. I'm maybe wrong on this, maybe we copy them since they're only 1000 records, but more likely we keep them for 28 days.

eagr commented 4 days ago

What you said makes sense. If the set is already stored somewhere, for sure we shouldn't duplicate it. I've tried searching for auth, valid, set, key in Substrate, and gone through the storage items, but I couldn't find a storage that corresponds to OldValidators anywhere. So yeah, if that's the case, I'd need some pointer to how to access it.