ethereum / pm

Project Management: Meeting notes and agenda items

Ethereum Core Devs Meeting 118 Agenda #354

Closed timbeiko closed 3 years ago

timbeiko commented 3 years ago

Meeting Info

Agenda

  1. London Updates
    1. Ropsten Issue
    2. gasPrice for 1559 transactions. Comments against it:
  2. Other Discussion Items
    1. https://github.com/ethereum/pm/issues/356
    2. https://github.com/ethereum/pm/issues/361
    3. https://github.com/ethereum/pm/issues/360
    4. https://github.com/ethereum/pm/issues/357
      1. Announcements

AlexeyAkhunov commented 3 years ago

I am not sure I will be able to join the meeting, but I would like to say that there was an interesting discovery made during the Ropsten incident, which we need to be aware of. Because the majority of miners use Geth, and Geth did not have the ability to repair Ropsten nodes that were past the bad block, miners had to perform a full sync (not fast sync and not snap sync) before they could start mining on the correct chain. This means that if something like that were to happen on mainnet, there would be no way back, and Geth's version (whatever it is) would need to become the de facto consensus rule.

In general, I think we need to have a better understanding about "what we are going to do if X happens?" where X is any of the issues that happened on the test-net. Yes, it is unpleasant to think about it, but we need some pre-decisions to relieve Geth devs from the responsibility to make all the hard decisions on the spot, which would be incredibly stressful.

P.S. I have implemented functionality in Erigon today to be able to go back before any bad block in the past (and used it to repair Ropsten), but unfortunately we have not finished mining support for Erigon yet. And even if we had, it would take a while for miners to adopt our code. So this is not a proposed solution, just extra info that such functionality may be possible to have.

AlexeyAkhunov commented 3 years ago

The Ropsten incident also highlights another potential issue that may apply to many implementations (currently including Erigon). It seems that in a network where two competing chains co-exist, the minority part is very unstable, with nodes disconnecting from each other and reconnecting all the time. This may be related to something I have observed: some nodes propagate blocks/headers even though they are not on their best chain. This leads to these nodes being kicked out (disconnected) by other nodes on the correct chain. I am going to think about how to make our implementation a bit more robust, but perhaps others need to take a look as well.

poojaranjan commented 3 years ago

If time permits, I would like to make announcements for the upcoming PEEPanEIP meetings on the Merge and the Block Gas Limit.

holiman commented 3 years ago

> and Geth did not have the ability to repair Ropsten nodes that were past the bad block

That's not correct. There are basically a couple of things that can happen during a fork. I'll outline a couple of scenarios.

Synced node followed wrong chain

You were running geth, and were in sync. At block X, the fork happened. Your node followed the erroneous higher-td chain, and at block Z, you stopped the node and updated to the patched version.

Problem description: the node is still on the 'bad' chain. Solution: do a debug.setHead(X-1) to jump to before the fork. Internally, this rewinds the chain to some state before X. It might not be X-1, since geth might not have the full state for that block, but it will have the state for some earlier block. Usually, geth flushes the state to disk every ~10K blocks (or whatever corresponds to 1 hour of processing), and/or during shutdown. If geth is running with gcmode=archive, it flushes every block.
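
To make the rewind step concrete, here is a minimal sketch of building the console call for a given fork block. It assumes that the geth console expects the block number as a hex string, which is worth checking against the docs for your version. The helper name `set_head_command` and the fork height 123123 are made up for illustration and are not part of geth.

```python
def set_head_command(fork_block: int) -> str:
    """Build the geth console call that rewinds to the block before the fork.

    Hypothetical helper: only formats the string, it does not talk to geth.
    """
    if fork_block < 1:
        raise ValueError("fork block must be at least 1")
    target = fork_block - 1  # X-1, the last block before the fork
    # Assumption: the console wants the number hex-encoded, e.g. "0x1e0f2".
    return f'debug.setHead("{hex(target)}")'


if __name__ == "__main__":
    # Illustrative fork height only.
    print(set_head_command(123123))  # debug.setHead("0x1e0f2")
```

As noted above, geth may end up at an earlier block than the one requested if it has no full state for X-1.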

Syncing in the presence of a wrong higher-td chain

You are syncing a geth node, and a fork has occurred at block X. Since the fork has already happened, and the erroneous chain has higher TD, you will most likely wind up on the 'wrong' side of the fork, with a pivot block X+M. If this happens, you do not have any state for blocks < X+M, so you cannot do debug.setHead to resolve the situation.

In this case, a resync is required. However, you need to prevent geth from winding up on the wrong side of the fork. This can be done with the whitelist command line parameter.

$ geth -h | grep white
  --whitelist value                   Comma separated block number-to-hash mappings to enforce (<number>=<hash>)

So you'd do geth --whitelist 123123=0x2342fafa9af9af9af9af9af9

The whitelist means that geth, when peering with another peer, will ask the peer "what's your block 123123?". If it gets a header back with a hash that doesn't match the whitelist, it will disconnect from that peer. So essentially, the node will isolate itself from peers on the wrong chain, and only connect to peers that will deliver blocks from the shorter (but correct) chain.
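
The peer check described above can be modelled roughly as follows. This is a toy sketch of the decision only, not geth's actual code: the function name `should_disconnect` and the dictionary shapes are invented here, and treating a peer that has no header for the whitelisted block as acceptable is an assumption of the sketch.

```python
from typing import Dict, Optional


def should_disconnect(
    whitelist: Dict[int, str],
    peer_hashes: Dict[int, Optional[str]],
) -> bool:
    """Return True if the peer reports a different hash for a whitelisted block.

    whitelist:   block number -> hash we insist on (from --whitelist)
    peer_hashes: block number -> hash the peer returned, or None if it has none
    """
    for number, expected in whitelist.items():
        reported = peer_hashes.get(number)
        if reported is None:
            # Assumption of this sketch: no answer is not treated as a mismatch.
            continue
        if reported.lower() != expected.lower():
            return True  # peer is on a different chain at this height
    return False


if __name__ == "__main__":
    whitelist = {123123: "0x2342fafa9af9af9af9af9af9"}  # hash from the example above
    print(should_disconnect(whitelist, {123123: "0x2342fafa9af9af9af9af9af9"}))
    print(should_disconnect(whitelist, {123123: "0xdeadbeef"}))
```

The first call models a peer on the same chain and the second a peer on the other side of the fork.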

timbeiko commented 3 years ago

Closed in favor of https://github.com/ethereum/pm/issues/365

kf106 commented 2 years ago

Does anyone have some clear instructions on how to actually fix your node if you upgraded after the split? With the actual numbers?

I am running 1.10.8, and upgraded after the split.

I noticed that the balances on my Ropsten node did not match those on MetaMask and Etherscan a couple of days ago, and that my block height was higher than Etherscan's, so clearly I was on the wrong chain.

But 3.5 GHsh of hashing power was going into continuing that chain, so there are clearly other people still mining away on it too.

The Meeting 118.md article confusingly talks about Ropsten and then lists Mainnet blocks and hashes at the end.

I thought I had understood the issue and how to correct it, and did the following:

Instead of a fixed chain, what I got was a segmentation violation error:

INFO [09-05|11:14:18.197] Looking for peers                        peercount=1 tried=25 static=0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1c0 pc=0xb3180f]

goroutine 1856 [running]:
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).findAncestorBinarySearch(0xc00037c540, 0xc034278270, 0x1, 0xa786c6, 0xa19561, 0xa19561, 0x0, 0x17c6020)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:966 +0x58f
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).findAncestor(0xc00037c540, 0xc034278270, 0xc002a518c0, 0xc002a51b00, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:818 +0x3a5
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).syncWithPeer(0xc00037c540, 0xc034278270, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:475 +0x517
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).synchronise(0xc00037c540, 0xc03488f880, 0x40, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0xc000000001, 0x0, ...)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:431 +0x3b0
github.com/ethereum/go-ethereum/eth/downloader.(*Downloader).Synchronise(0xc00037c540, 0xc03488f880, 0x40, 0x6215089fee241794, 0x743c81e9df254112, 0xe8f5e2540c043313, 0xba9caf4ba6bf2ba, 0xc01709e260, 0xc000000001, 0x4842c0, ...)
  github.com/ethereum/go-ethereum/eth/downloader/downloader.go:326 +0x8c
github.com/ethereum/go-ethereum/eth.(*handler).doSync(0xc004ca5b00, 0xc02ab9c300, 0x0, 0x0)
  github.com/ethereum/go-ethereum/eth/sync.go:324 +0x125
github.com/ethereum/go-ethereum/eth.(*chainSyncer).startSync.func1(0xc0001e0bd0, 0xc02ab9c300)
  github.com/ethereum/go-ethereum/eth/sync.go:300 +0x38
created by github.com/ethereum/go-ethereum/eth.(*chainSyncer).startSync
  github.com/ethereum/go-ethereum/eth/sync.go:300 +0x76

I'm now using the following whitelist parameter and resyncing from scratch instead:

--whitelist 10679538=0x569dccc25294768c23249db843ef7156e8e7c6c94cb82cd84a833f9c3e1d72e5

I hope I've got that right.
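
For anyone wanting to sanity-check the shape of a whitelist value before a long resync, here is a small sketch. It only checks the `<number>=<hash>` format shown in the help output and that each hash is 32 bytes of hex. It cannot tell whether the hash belongs to the correct chain. The function name `parse_whitelist` is hypothetical and this is not how geth itself parses the flag.

```python
import re
from typing import Dict

# One entry: decimal block number, "=", then a 0x-prefixed 32-byte hash.
_ENTRY = re.compile(r"^(\d+)=(0x[0-9a-fA-F]{64})$")


def parse_whitelist(value: str) -> Dict[int, str]:
    """Parse a comma-separated --whitelist value into {block number: hash}."""
    result: Dict[int, str] = {}
    for entry in value.split(","):
        match = _ENTRY.match(entry.strip())
        if match is None:
            raise ValueError(f"malformed whitelist entry: {entry!r}")
        result[int(match.group(1))] = match.group(2).lower()
    return result


if __name__ == "__main__":
    value = (
        "10679538="
        "0x569dccc25294768c23249db843ef7156e8e7c6c94cb82cd84a833f9c3e1d72e5"
    )
    print(parse_whitelist(value))
```

A value with a truncated hash or a missing `=` raises `ValueError` instead of being passed through silently.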