ethereum / py-evm

A Python implementation of the Ethereum Virtual Machine
https://py-evm.readthedocs.io/en/latest/
MIT License
2.27k stars 654 forks source link

Ropsten sync stuck at #3285226 - Unable to find common ancestor #1320

Closed MysticRyuujin closed 6 years ago

MysticRyuujin commented 6 years ago

0.1.0a15/linux/cpython3.6.5

What is wrong?

I'm trying to sync the full Ropsten chain but it appears to have gotten stuck and I'm not sure what to do to fix it.

    INFO  09-25 18:24:57      server  enode://cd2a844398bb2e5870b0968396665ce4881918062aab2edc2a4210e7d917e5a83d5e64010f6cc04f5d4b31e67f7fb958f16ce692abc3c6b29913044c3ddac2f9@0.0.0.0:30303
    INFO  09-25 18:24:57      server  network: 3
    INFO  09-25 18:24:57      server  peers: max_peers=25
    INFO  09-25 18:24:57   discovery  Preferred peers: (<Node(0x35b3255b8ee2c15e6d8489321f65a471c80d7cf4bb38d0aaf2953d9b0004196dfe76b50625b202f24626f1ba9d393c17e624c6bab549f5aa5068c71fe5a08151@192.168.15.33:30303)>,)
    INFO  09-25 18:24:57        peer  Connected peers: 0 inbound, 0 outbound
    INFO  09-25 18:24:57   discovery  boostrapping with (<Node(0x30b7ab30a01c124a6cceca36863ece12c4f5fa68e3ba9b0b51407ccc002eeed3b3102d20a88f1c1d3c3154e2449317b8ef95090e77b312d5cc39354f86d5d606@52.176.7.10:30303)>, <Node(0x865a63255b3bb68023b6bffd5095118fcc13e79dcf014fe4e47e065c350c7cc72af2e53eff895f11ba1bbb6a2b33271c1116ee870f266618eadfc2e78aa7349c@52.176.100.77:30303)>, <Node(0x6332792c4a00e3e4ee0926ed89e0d27ef985424d97b6a45bf0f23e51f0dcb5e66b875777506458aea7af6f9e4ffb69f43f3778ee73c81ed9d34c51c4b16b0b0f@52.232.243.152:30303)>, <Node(0x94c15d1b9e2fe7ce56e458b9a3b672ef11894ddedd0c6f247e0f1d3487f52b66208fb4aeb8179fce6e3a749ea93ed147c37976d67af557508d199d9594c35f09@192.81.208.223:30303)>)
    INFO  09-25 18:24:57     service  Starting fast-sync; current head: #3285226
    INFO  09-25 18:24:57        peer  Adding ETHPeer <Node(0x35b3@192.168.15.33)> to pool
    INFO  09-25 18:24:57       chain  Starting sync with ETHPeer <Node(0x35b3@192.168.15.33)>
 WARNING  09-25 18:24:57       chain  Unable to find common ancestor betwen our chain and ETHPeer <Node(0x35b3@192.168.15.33)>
    INFO  09-25 18:24:58        peer  ETHPeer <Node(0x35b3@192.168.15.33)> finished, removing from pool
    INFO  09-25 18:24:59   discovery  Invariant: received ping from this_node: <Node(0xcd2a@127.0.0.1)>
    INFO  09-25 18:25:10        peer  Adding ETHPeer <Node(0x6332@52.232.243.152)> to pool
    INFO  09-25 18:25:10       chain  Starting sync with ETHPeer <Node(0x6332@52.232.243.152)>
 WARNING  09-25 18:25:11       chain  Unable to find common ancestor betwen our chain and ETHPeer <Node(0x6332@52.232.243.152)>
    INFO  09-25 18:25:38        peer  Adding ETHPeer <Node(0x865a@52.176.100.77)> to pool
    INFO  09-25 18:25:38       chain  Starting sync with ETHPeer <Node(0x865a@52.176.100.77)>
 WARNING  09-25 18:25:38       chain  ETHPeer <Node(0x865a@52.176.100.77)> announced difficulty 11348680748315317, but didn't return any headers after <BlockHeader #3285226 85538caa>@7977680760997843
    INFO  09-25 18:25:38        peer  ETHPeer <Node(0x865a@52.176.100.77)> finished, removing from pool
    INFO  09-25 18:25:57        peer  Connected peers: 0 inbound, 1 outbound
    INFO  09-25 18:25:57        peer  Peer subscribers: 1, longest queue: FastChainSyncer(0)
    INFO  09-25 18:26:28        peer  ETHPeer <Node(0x6332@52.232.243.152)> finished, removing from pool
MysticRyuujin commented 6 years ago

Nevermind...it just started working again out of nowhere...

MysticRyuujin commented 6 years ago

So it did start syncing again but it's stuck again and it's been stuck for a while now...at least a day now, maybe 2...

Block is now 3370936

0.1.0a16

    INFO  09-28 14:17:29        FullNodeSyncer  Starting regular sync; current head: #3370936
    INFO  09-28 14:17:29           ETHPeerPool  Adding ETHPeer <Node(0x35b3@192.168.15.33)> to pool
    INFO  09-28 14:17:29    RegularChainSyncer  Starting sync with ETHPeer <Node(0x35b3@192.168.15.33)>
 WARNING  09-28 14:17:30    RegularChainSyncer  Unable to find common ancestor betwen our chain and ETHPeer <Node(0x35b3@192.168.15.33)>
    INFO  09-28 14:17:30           ETHPeerPool  ETHPeer <Node(0x35b3@192.168.15.33)> finished, removing from pool
    INFO  09-28 14:17:50           ETHPeerPool  Adding ETHPeer <Node(0x865a@52.176.100.77)> to pool
    INFO  09-28 14:17:50    RegularChainSyncer  Starting sync with ETHPeer <Node(0x865a@52.176.100.77)>
 WARNING  09-28 14:17:50    RegularChainSyncer  ETHPeer <Node(0x865a@52.176.100.77)> announced difficulty 11592197375320558, but didn't return any headers after <BlockHeader #3370936 6482af4e>@7977888832680901
    INFO  09-28 14:17:50           ETHPeerPool  ETHPeer <Node(0x865a@52.176.100.77)> finished, removing from pool
    INFO  09-28 14:17:51           ETHPeerPool  Adding ETHPeer <Node(0x05d7@213.207.92.204)> to pool
    INFO  09-28 14:17:51    RegularChainSyncer  Starting sync with ETHPeer <Node(0x05d7@213.207.92.204)>
 WARNING  09-28 14:18:11    RegularChainSyncer  Timeout waiting for header batch from ETHPeer <Node(0x05d7@213.207.92.204)>, aborting sync
    INFO  09-28 14:18:11           ETHPeerPool  ETHPeer <Node(0x05d7@213.207.92.204)> finished, removing from pool
    INFO  09-28 14:18:29           ETHPeerPool  Connected peers: 0 inbound, 0 outbound
    INFO  09-28 14:18:29           ETHPeerPool  Peer subscribers: 1, longest queue: RegularChainSyncer(0)
    INFO  09-28 14:18:37           ETHPeerPool  Adding ETHPeer <Node(0x865a@52.176.100.77)> to pool
    INFO  09-28 14:18:37    RegularChainSyncer  Starting sync with ETHPeer <Node(0x865a@52.176.100.77)>
 WARNING  09-28 14:18:37    RegularChainSyncer  ETHPeer <Node(0x865a@52.176.100.77)> announced difficulty 11592247233922476, but didn't return any headers after <BlockHeader #3370936 6482af4e>@7977888832680901
    INFO  09-28 14:18:37           ETHPeerPool  ETHPeer <Node(0x865a@52.176.100.77)> finished, removing from pool
    INFO  09-28 14:18:52           ETHPeerPool  Adding ETHPeer <Node(0x865a@52.176.100.77)> to pool
    INFO  09-28 14:18:52    RegularChainSyncer  Starting sync with ETHPeer <Node(0x865a@52.176.100.77)>
 WARNING  09-28 14:18:53    RegularChainSyncer  ETHPeer <Node(0x865a@52.176.100.77)> announced difficulty 11592257215383829, but didn't return any headers after <BlockHeader #3370936 6482af4e>@7977888832680901
    INFO  09-28 14:18:53           ETHPeerPool  ETHPeer <Node(0x865a@52.176.100.77)> finished, removing from pool
    INFO  09-28 14:19:29           ETHPeerPool  Connected peers: 0 inbound, 0 outbound
    INFO  09-28 14:19:29           ETHPeerPool  Peer subscribers: 1, longest queue: RegularChainSyncer(0)
MysticRyuujin commented 6 years ago

For an FYI 192.168.15.33 is my Harmony node, on the same subnet, configured as a preferred peer. Which is 100% fully in sync on Ropsten, you can see it here: https://ropsten-stats.parity.io/

pipermerriam commented 6 years ago

Can you pull the block hash for the block you are stuck on? Does it match up with expected one (from something like etherscan)?

MysticRyuujin commented 6 years ago

No, actually, it doesn't look like it does... I somehow went down a bad fork and can't recover?

https://ropsten.etherscan.io/block/3370936 Expected Hash: 0xedde01eb93f4444602e31fe48cba8bde46c06d755ce86f210063887db7181408

In [4]: w3.eth.getBlock(3370936)
Out[4]: 
AttributeDict({'difficulty': 2455632,
 'extraData': HexBytes('0xda8301080b846765746888676f312e31302e328777696e646f7773'),
 'gasLimit': 4712388,
 'gasUsed': 0,
 'hash': HexBytes('0x6482af4e2b8ad85a0631403d7eed1b9349c053fe47c9e266177b857e69d1c3ca'),

I went back to see where it went wrong...

Block 2933474 is OK: Expected Hash: 0x534761a2df8250a3855aaa7a6fff825b553610bff1651151c9c47d0bda7c4b6d

In [30]: w3.eth.getBlock(2933474)
Out[30]: 
AttributeDict({'difficulty': 339187343,
 'extraData': HexBytes('0xd583010b008650617269747986312e32342e31826c69'),
 'gasLimit': 4700036,
 'gasUsed': 338496,
 'hash': HexBytes('0x534761a2df8250a3855aaa7a6fff825b553610bff1651151c9c47d0bda7c4b6d')

Block 2933475 is not in sync... Expected Hash: 0x968061cb7929cc2fcf15139b6cd4f2777cd76466756f79ec80f3c4961c800ca5

In [31]: w3.eth.getBlock(2933475)
Out[31]: 
AttributeDict({'difficulty': 338856107,
 'extraData': HexBytes('0x414952412f7630'),
 'gasLimit': 4700036,
 'gasUsed': 3025003,
 'hash': HexBytes('0xe296755918d3dbf5d917cb826bba6191289eff2196af536b8889f94972111dd4'),

I somehow went more than 400,000 blocks out of sync? That seems odd, no?

pipermerriam commented 6 years ago

Well, for ropsten, I don't think it is uncommon for there to be very large forks like this. There have been times when spam was cleaned up by a coordinated group mining a massive re-org.

Trinity doesn't deal well with re-orgs so... You'll either have to reset to genesis or be clever and do some direct db modifications to roll your chain back to the last good block.

MysticRyuujin commented 6 years ago

Well given that I don't know how to do the latter, that sucks 😄

pipermerriam commented 6 years ago

lemme see if I can hack something together for you.

pipermerriam commented 6 years ago

I don't know if this will actually work so... worth a try but might end up with you needing to just reset.

You should be able to run ipython -i scripts/db-shell.py --ropsten which will drop you into a python repl witha chaindb object in the context.

You should verify you are indeed connected to the ropsten chain by matching up the returned block hashes with what you expect to see.

You can pull the block header of 2933474 with:

>>> header = chaindb.get_canonical_block_header_by_number(2933474)

Then you can force your db to think that is the head block using:

>>> chaindb._set_as_canonical_chain_head(header.hash)
pipermerriam commented 6 years ago

This is going to leave 400,000 blocks of cruft dangling in your database for which there's no good way to remove them...

MysticRyuujin commented 6 years ago

I installed via pip3, not sure what scripts dir is in that context. No biggie, I'll just re-sync.

Thanks anyway.

carver commented 6 years ago

BTW, I had this same issue (same block). Not surprising that a lot of people would run into this, since it's one of the boot nodes that is leading us astray. Of course we could remove the boot node, but I actually think it's a useful test case. Some possible mitigations are discussed here: https://github.com/ethereum/py-evm/issues/1324#issuecomment-425192348

pipermerriam commented 6 years ago

oh, for future reference you'd launch with python -i -m scripts.db-shell if installed with pip.