Closed prestonvanloon closed 3 years ago
Small (bug related?) issue in your description:
The genesis time is not 1605700800
, that is the minimum genesis time.
The actual genesis time (known as state.genesis_time
) is 1605700807 # (Wednesday, November 18, 2020 12:00:07 PM)
This is due to it being triggered by the first eth1 block after the minimum genesis time, minus eth1 delay.
@protolambda Thanks for that check. I'll update description, i was off by a few seconds. It doesn't explain the 30mins discrepancy though!
Also note that the genesis delay was set to 5 days. This may relate to the nimbus bug!
Edit: exactly 5 days (genesis delay) and ~9 hours (time since genesis now). Definitely a bug. I wonder if the nimbus bug gets resolved without action after first voting period. Could be that the eth1 vote data was prepared too early.
Also looks like I have the assumption wrong, we want the highest block up to the voting round start time. Editing that...
Thinking about this more, Prysm should be voting with LH+Teku. Assigning @rkapka to look at that. It seems our voting with the majority is not working properly.
Assigning to @nisdas to look into our deviation of mainnet spec configuration of 14s for eth1 block times. https://github.com/prysmaticlabs/prysm/blob/095c4d5dd51a11c8d63c58e01a4a4774a0b5ad27/shared/params/mainnet_config.go#L85
The reason I am filing a bug in Prysm is that I think that Prysm's voting with the majority mechanism should consider votes from the previous voting period to be acceptable.
This would be against the specification: https://github.com/ethereum/eth2.0-specs/blob/dev/specs/phase0/validator.md#get_eth1_data
Specifically take a look at the definition of is_candidate_block
and
votes_to_consider = [
get_eth1_data(block) for block in eth1_chain
if (
is_candidate_block(block, period_start)
# Ensure cannot move back to earlier deposit contract states
and get_eth1_data(block).deposit_count >= state.eth1_data.deposit_count
)
]
# Valid votes already cast during this period
valid_votes = [vote for vote in state.eth1_data_votes if vote in votes_to_consider]
Lighthouse and Teku are voting on a block that is 30 minutes older than what Prysm thinks is the start of the eth1voting period.
It seems that either both clients don't follow the spec or Prysm's voting period start time is incorrect.
Lighthouse and Teku are voting on a block that is 30 minutes older than what Prysm thinks is the start of the eth1voting period.
This actually makes sense. The voting period starts and we are supposed to consider the highest eth1 block before the start of the voting period.
I restarted all prylabs nodes and they are suddenly voting for the majority block.
I could not reproduce this locally either. I am thinking there is some caching issue of some kind, but overall the majority vote determining algorithm is correct. Lowering to P1, but still with some urgency before v1 mainnet.
This is fixed for pyrmont by #7861.
Leaving open until we fix the mainnet spec difference.
🐞 Bug Report
Description
At the moment, eth1data consensus is not being reached.
Nimbus is voting for a very old block, 5 days old. https://goerli.etherscan.io/block/0x0566a632683a61180a735bf72f7daea59deeb9781d4a43be41a86d484df6287f
Lighthouse and Teku are voting on a block that is 30 minutes older than what Prysm thinks is the start of the eth1voting period. https://goerli.etherscan.io/block/0x3da88bec112d32d5e81741dfd395df14381346ebf48b91b1bfa111fdb7a69b5f (Nov-18-2020 10:18:00 AM +UTC)
Prysm thinks the start of this eth1data voting period is
Genesis time is 1605700807 Slot is 2048 Eth1FollowDistance is 2048
So the current start slot time minus the eth1followdistance comes out to be
Wed Nov 18 2020 02:51:51 GMT-0800 (Pacific Standard Time)
.So Prysm considers this block to be the highest block in the previous eth1data voting period. https://goerli.etherscan.io/block/0x227ed776f1e4a705ec8fae4ac1342d316be3f4cb9bf5248a9848bf344776f9c5 (Nov-18-2020 10:51:45 AM +UTC)
The reason I am filing a bug in Prysm is that I think that Prysm's voting with the majority mechanism should consider votes from the previous voting period to be acceptable. Otherwise we could see these types of issues or off by one issues when multiple blocks are produced in quick succession on eth1.
Another thing to consider, if client teams do not have the same eth1 block time of 14s then it will be even more difficult to reach eth1data consensus at mainnet. Some additional context: https://github.com/ethereum/eth2.0-specs/issues/2132 Prysm MUST conform with other clients in this regard for mainnet configuration. As such, I am marking this as a P0.
Has this worked before in a previous version?
Maybe when Prysm was the majority then Lighthouse and Teku would favor Prysm's block vote because it is still in range of their interpreted eth1data voting period.
🔬 Minimal Reproduction
This is occuring in Pyrmont testnet.
🔥 Error
N/A
🌍 Your Environment
Operating System:
N/A
What version of Prysm are you running? (Which release)
v1.0.0-beta.3
Anything else relevant (validator index / public key)?