ethereum / go-ethereum

Go implementation of the Ethereum protocol
https://geth.ethereum.org
GNU Lesser General Public License v3.0

geth --fast stalls before crossing finish line #15001

Closed Dirksterson closed 6 years ago

Dirksterson commented 7 years ago

System information

Geth version: geth version 1.5.9-stable, Go1.7.4
OS & Version: OSX 10.12.6
MacMini 4GB RAM (latest MacMini doesn't support field RAM upgrade anymore)
VDSL connection with an average of 20-40Mbit throughput.
Ethereum Wallet 0.9.0
Commit hash : (if develop)

Expected behaviour

fast sync to current latest block followed by auto disabling

Actual behaviour

Stalls anywhere from a few thousand down to a few hundred blocks short of the current latest block. It tries to catch up, but new blocks arrive faster than fast sync imports them. Fast sync mode never auto-disables.

Steps to reproduce the behaviour

Run geth removedb, then geth --fast --cache=1024. Tried 5 times on that machine over the last few weeks.

Fast sync from scratch is already my workaround. Before that, I tried and failed to sync on that machine with the existing blockchain data; that was also a lost race to catch up to the latest block. The workaround was good enough until now.

Today even the workaround in fast sync mode (--cache=1024) will not completely load the blockchain anymore. It gets to within a few hundred blocks of the latest block and then stalls for hours. By the time it catches up a few hundred blocks, the latest block has moved ahead again. The closer geth gets to the latest block (at time of writing 4173161), the slower it gets. It does not catch up anymore. I have tried 5 times over the last few weeks, giving up after around 4-5 days each time.

Does the machine no longer meet today's minimum hardware requirements, or is this a major bug?

Backtrace

latest block 13 hours ago (!)

I0818 00:15:26.444933 core/blockchain.go:805] imported 148 receipts in 2.775s. #4169952 [e3f556fc… / 36f4d3c9…]

...

latest header chain 50 minutes ago

I0818 12:47:45.107445 core/headerchain.go:342] imported 1 headers in 4.954ms. #4173009 [350d1426… / 350d1426…]

...

currently importing nothing but state entries

I0818 13:36:41.103101 eth/downloader/downloader.go:966] imported 172 state entries in 10.009s: processed 10010213, pending at least 129361
I0818 13:36:41.103131 eth/downloader/downloader.go:966] imported 384 state entries in 783.519ms: processed 10010597, pending at least 129361
I0818 13:36:41.103154 eth/downloader/downloader.go:966] imported 381 state entries in 6.963s: processed 10010978, pending at least 129361
I0818 13:36:41.103167 eth/downloader/downloader.go:966] imported 25 state entries in 87.654ms: processed 10011003, pending at least 129360
I0818 13:36:46.014244 eth/downloader/downloader.go:966] imported 384 state entries in 2.482s: processed 10011387, pending at least 127584
I0818 13:36:49.074483 eth/downloader/downloader.go:966] imported 381 state entries in 7.082s: processed 10011768, pending at least 127105
I0818 13:36:49.074553 eth/downloader/downloader.go:966] imported 384 state entries in 7.971s: processed 10012152, pending at least 127105
I0818 13:36:49.074574 eth/downloader/downloader.go:966] imported 384 state entries in 3.772s: processed 10012536, pending at least 127105
I0818 13:36:49.074603 eth/downloader/downloader.go:966] imported 162 state entries in 5.822s: processed 10012698, pending at least 127105
I0818 13:36:49.074622 eth/downloader/downloader.go:966] imported 25 state entries in 4.050s: processed 10012723, pending at least 127105
I0818 13:36:49.074639 eth/downloader/downloader.go:966] imported 381 state entries in 3.060s: processed 10013104, pending at least 127105
I0818 13:36:49.074742 eth/downloader/downloader.go:966] imported 85 state entries in 7.117s: processed 10013189, pending at least 127105
I0818 13:36:49.074765 eth/downloader/downloader.go:966] imported 375 state entries in 2.219s: processed 10013564, pending at least 127105
I0818 13:36:49.074782 eth/downloader/downloader.go:966] imported 87 state entries in 3.915s: processed 10013651, pending at least 127105
I0818 13:36:49.074795 eth/downloader/downloader.go:966] imported 23 state entries in 271.734ms: processed 10013674, pending at least 127104

ainsleys commented 6 years ago

I was able to successfully sync on a MacBook Pro, Sierra, 16GB RAM. Steps I took:

1- free up machine from all other noncritical memory-using tasks
2- delete chaindata folder
3- fast sync using geth
4- do not open Mist browser until process is completed.

I synced before 24 h had passed.

holiman commented 6 years ago

There are 50+ million state entries to download. Downloading blocks and receipts finishes a lot faster, but it's not done until all state entries are downloaded. This takes time.
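To put "this takes time" into numbers, here is a rough sketch that estimates the import rate from the first and last state-entry log lines in the original report. The 50 million total is only the lower bound quoted in this comment, and an 8-second window is a tiny sample, so treat the result as an order-of-magnitude guess rather than a prediction.

```python
import re
from datetime import datetime

# First and last state-import lines copied from the log excerpt in the report.
LOG = """\
I0818 13:36:41.103101 eth/downloader/downloader.go:966] imported 172 state entries in 10.009s: processed 10010213, pending at least 129361
I0818 13:36:49.074795 eth/downloader/downloader.go:966] imported 23 state entries in 271.734ms: processed 10013674, pending at least 127104
"""

LINE_RE = re.compile(
    r"^I\d{4} (\d{2}:\d{2}:\d{2}\.\d+) .*processed (\d+), pending at least (\d+)"
)

def parse(log):
    """Return (timestamp, processed) pairs for every state-import line."""
    samples = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            samples.append((ts, int(m.group(2))))
    return samples

def entries_per_second(samples):
    """Average rate between the first and last sample."""
    (t0, p0), (t1, p1) = samples[0], samples[-1]
    return (p1 - p0) / (t1 - t0).total_seconds()

samples = parse(LOG)
rate = entries_per_second(samples)

ASSUMED_TOTAL = 50_000_000  # assumption: the "50+ million" lower bound quoted above
remaining = ASSUMED_TOTAL - samples[-1][1]

print(f"rate: {rate:.0f} entries/s")
print(f"at least {remaining / rate / 3600:.1f} more hours at that rate")
```

On these two lines the rate comes out at roughly 430 entries per second, which leaves about a day of further downloading even if the trie were exactly 50 million entries and the rate held steady.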

ainsleys commented 6 years ago

Yes, it's a good point. I think there are two main points of confusion for people: 1- not understanding the normal behavior of fast sync (to switch to normal sync at a certain point, thus appearing to suddenly and drastically slow down just before completion) 2- a strange UI characteristic on the Mist load screen, where the size of "downloading chain structure" appears to outpace the amount downloaded. Anyway, happy syncing :).

tuxx42 commented 6 years ago

is this problem going to be addressed?

cecilia-to commented 6 years ago

This is the nature of P2P. It depends on which peers you find, and your peers depend on others, and they come and go. There is an option to add specific peers, and I believe there will eventually be a pool of relatively stable peers (Infura?). The program does know how to 'pick up', though; you just need to wait, and if you see it is really stuck, stop and restart it and see if it can find better peers. This is why I don't recommend letting Mist control geth; run it separately.

mjdillon commented 6 years ago

I've never had a problem syncing a geth 1.7.3 on a Debian node, but I ran into the "I just can't get the last 100 blocks in sync" issue on a Ubuntu 16.04 node I was spinning up.

I had been using the PPA version of geth, and I switched to the direct-download release 1.7.3 version of geth and this cleared right up on the same box.

I assume that something in the compilation of the two releases is at fault, but I will note that I moved from running geth from /usr/bin to /usr/local/bin. This machine is otherwise a stock Ubuntu 16.04 server installation.

alvaradojl commented 6 years ago

I'm having the same problem using geth on windows, gets stuck on the last hundred blocks

winteraz commented 6 years ago

I'm on ubuntu and I have the same error. Is the ubuntu version deprecated ?

mjdillon commented 6 years ago

For some reason I got a 2x increase in mgasps when I swapped the pre-compiled binary in for the PPA version on ubuntu. I got another 3x increase in mgasps on an identical system switching to a stock debian 9 server installation, all of the above untuned.

I haven't had time to look into CLANG vs GCC compiling from source, and probably won't. It's interesting people on windows have this problem as well, I was suspecting it had something to do with what versions of libraries are on Ubuntu vs. Debian.


allquantor commented 6 years ago

Same issue running last stable docker (1.7.3)

matYang commented 6 years ago

Same issue running 1.7.3 on Ubuntu 16, 4G Mem, 100G Harddrive

I have restarted quite a few times and every time it gets stuck at a different place. This is really making me lose faith in Ethereum.

ZxMYS commented 6 years ago

Same here. Turning it off and on solves it. BUT IF I CAN SOLVE IT BY TURNING IT OFF AND ON WHY CANT THE PROGRAM DETECT NOTHING IS BEING DONE AND TURN ITS SYNC OFF AND ON AGAIN???

mattotodd commented 6 years ago

Can not sync to mainnet

I have tried both --fast and full sync on Mac 10.10 and Ubuntu 16.04. Always gets to within 100-200 blocks of current block height (within about 4-6 hours of start of sync), but never finishes, even days later. Restarting geth helps kick off the program again to move ahead a few blocks, but eventually it just starts spitting out Imported new state entries and stops progressing.

{
  currentBlock: 5008772,
  highestBlock: 5008970,
  knownStates: 1413559,
  pulledStates: 1399554,
  startingBlock: 5008691
}

Syncs fine on rinkeby testnet

I'm happy to help diagnose the issue or provide info, but at this point I don't know where to look. Is this expected?

aleksey-makarov commented 6 years ago

Same issue 1.7.3-stable-4bb3c89d geth --cache=2048

karalabe commented 6 years ago

That's not an issue. That is how it works. The state trie currently is 70+M entries. You need to wait it out until all of those are downloaded.

michaelx2018 commented 6 years ago

How many entries are there as of today? More than 70M? Is there any place where anybody can check it?

contang0 commented 6 years ago

@karalabe , how can I wait it out if import of every chain segment takes between 10s and 2min? Blocks are created faster than geth is syncing.

alejandro-amo commented 6 years ago

Stop saying "this is how it works", please read the data carefully. We are clearly stating that sync speed is slower than block count increase. We will NEVER catch up. You cannot say "this is how it works" in this case, for obvious reasons...

austorms commented 6 years ago

Yea, it's broken. Is anyone able to sync to current block height from scratch?

karalabe commented 6 years ago

Syncing Ethereum is a pain point for many people, so I'll try to detail what's happening behind the scenes so there might be a bit less confusion.

The current default mode of sync for Geth is called fast sync. Instead of starting from the genesis block and reprocessing all the transactions that ever occurred (which could take weeks), fast sync downloads the blocks, and only verifies the associated proof-of-works. Downloading all the blocks is a straightforward and fast procedure and will relatively quickly reassemble the entire chain.

Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case: since no transaction was executed, we do not have any account state available (i.e. balances, nonces, smart contract code and data). These need to be downloaded separately and cross-checked with the latest blocks. This phase is called the state trie download and it actually runs concurrently with the block downloads; alas, it takes a lot longer nowadays than downloading the blocks.

So, what's the state trie? In the Ethereum mainnet, there are a ton of accounts already, which track the balance, nonce, etc. of each user/contract. The accounts themselves are however insufficient to run a node; they need to be cryptographically linked to each block so that nodes can actually verify that the accounts are not tampered with. This cryptographic linking is done by creating a tree data structure above the accounts, each level aggregating the layer below it into an ever smaller layer, until you reach the single root. This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie.
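The layer-by-layer aggregation can be sketched in a few lines. This is a deliberately simplified toy, not Geth's data structure: the real state trie is a Merkle Patricia trie keyed by account address and hashed with Keccak-256, while this sketch hashes a flat list of made-up account records with SHA-256 as a stand-in. The record format and the `merkle_root` helper are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash for illustration; Ethereum itself uses Keccak-256.
    return hashlib.sha256(data).digest()

def merkle_root(leaves, fanout=16):
    """Hash the leaves, then fold each group of `fanout` hashes into one
    parent hash, repeating until a single root remains.
    Assumes `leaves` is non-empty."""
    layer = [h(leaf) for leaf in leaves]
    while len(layer) > 1:
        layer = [
            h(b"".join(layer[i:i + fanout]))
            for i in range(0, len(layer), fanout)
        ]
    return layer[0]

# Hypothetical account records, purely for illustration.
accounts = [f"account-{i}:balance={i * 10}".encode() for i in range(100)]
root = merkle_root(accounts)

# Changing a single account changes the root, which is what lets a node
# check downloaded account data against the root stored in a block.
tampered = list(accounts)
tampered[42] = b"account-42:balance=999999"
print(root.hex())
print(merkle_root(tampered) != root)
```

The point of the sketch is only the shape: 100 leaves collapse to 7 intermediate hashes and then to one root, and every one of those intermediate hashes is itself an item a syncing node has to fetch.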

Ok, so why does this pose a problem? This trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes). To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs to verify that no one in the network is trying to cheat you. This itself is already a crazy number of data items. The part where it gets even messier is that this data is constantly morphing: at every block (15s), about 1000 nodes are deleted from this trie and about 2000 new ones are added. This means your node needs to synchronize a dataset that is changing 200 times per second. The worst part is that while you are synchronizing, the network is moving forward, and state that you began to download might disappear while you're downloading, so your node needs to constantly follow the network while trying to gather all the recent data. But until you actually do gather all the data, your local node is not usable since it cannot cryptographically prove anything about any accounts.
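The "200 times per second" figure follows directly from the per-block numbers in the paragraph above. A quick check, using only those quoted figures (the net-growth line is an extrapolation that assumes the same rates hold all day):

```python
BLOCK_TIME_S = 15                # block interval quoted above
NODES_DELETED_PER_BLOCK = 1000   # approximate, from the paragraph above
NODES_ADDED_PER_BLOCK = 2000     # approximate, from the paragraph above

# Every deletion and every addition is a change the syncing node must track.
changes_per_second = (NODES_DELETED_PER_BLOCK + NODES_ADDED_PER_BLOCK) / BLOCK_TIME_S

# Net growth: additions minus deletions, over one day of blocks.
blocks_per_day = 86400 // BLOCK_TIME_S
net_growth_per_day = (NODES_ADDED_PER_BLOCK - NODES_DELETED_PER_BLOCK) * blocks_per_day

print(changes_per_second)   # 200.0
print(net_growth_per_day)   # 5760000
```

So besides churning 200 times per second, the trie would also grow by several million nodes per day at these rates, which is why a sync that takes days is chasing a larger target at the end than at the start.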

If you see that you are 64 blocks behind mainnet, you aren't yet synchronized, not even close. You are just done with the block download phase and still running the state downloads. You can see this yourself via the seemingly endless Imported state entries [...] stream of logs. You'll need to wait that out too before your node comes truly online.
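As a rough illustration of why a small block gap says little, the sketch below summarises an `eth.syncing` result using the numbers posted earlier in this thread. The field names are the ones shown in that output; the 70 million trie size is only the "70+M" lower bound mentioned above, and `sync_status` is a hypothetical helper, not part of Geth.

```python
ASSUMED_TRIE_SIZE = 70_000_000  # assumption: the "70+M entries" lower bound from this thread

def sync_status(s, trie_size=ASSUMED_TRIE_SIZE):
    """Summarise an eth.syncing-style dict."""
    return {
        "blocks_behind": s["highestBlock"] - s["currentBlock"],
        # knownStates only counts trie nodes discovered so far, so a small
        # gap here does not mean the state download is nearly finished.
        "states_known_but_not_pulled": s["knownStates"] - s["pulledStates"],
        "fraction_of_assumed_trie": s["pulledStates"] / trie_size,
    }

# Values copied from the eth.syncing output posted earlier in this thread.
reported = {
    "currentBlock": 5008772,
    "highestBlock": 5008970,
    "knownStates": 1413559,
    "pulledStates": 1399554,
    "startingBlock": 5008691,
}

status = sync_status(reported)
print(status["blocks_behind"])                      # 198
print(status["states_known_but_not_pulled"])        # 14005
print(f"{status['fraction_of_assumed_trie']:.1%}")  # 2.0%
```

That node looked 198 blocks from done, yet under the 70M assumption it had pulled about 2% of the state.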


Q: The node just hangs on importing state entries?!

A: The node doesn't hang, it just doesn't know how large the state trie is in advance so it keeps on going and going and going until it discovers and downloads the entire thing.

The reason is that a block in Ethereum only contains the state root, a single hash of the root node. When the node begins synchronizing, it knows about exactly 1 node and tries to download it. That node can refer to up to 16 new nodes, so in the next step, we'll know about 16 new nodes and try to download those. As we go along the download, most of the nodes will reference new ones that we didn't know about until then. This is why you might be tempted to think it's stuck on the same numbers. It is not; rather, it's discovering and downloading the trie as it goes along.
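The discovery process can be imitated with a toy tree. This is a sketch under heavy assumptions: a complete 16-ary tree with integer ids standing in for hashes, walked strictly breadth-first. A real state trie is irregular and far deeper, and Geth's scheduler is not this simple. `build_toy_trie` and `sync` are hypothetical helpers.

```python
from collections import deque

def build_toy_trie(depth, fanout):
    """Return {node_id: [child_ids]} for a complete tree.
    Integer ids are stand-ins for node hashes."""
    trie = {0: []}
    frontier, next_id = [0], 1
    for _ in range(depth):
        new_frontier = []
        for parent in frontier:
            children = list(range(next_id, next_id + fanout))
            next_id += fanout
            trie[parent] = children
            for child in children:
                trie[child] = []
            new_frontier.extend(children)
        frontier = new_frontier
    return trie

def sync(trie, root=0):
    """Download breadth-first, yielding (pulled, known) after every node.
    Only the root is known at the start, as with a block's state root."""
    known, pulled = 1, 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        pulled += 1
        children = trie[node]      # "downloading" a node reveals its children
        known += len(children)
        queue.extend(children)
        yield pulled, known

trie = build_toy_trie(depth=3, fanout=16)
progress = list(sync(trie))

for pulled, known in (progress[0], progress[16], progress[-1]):
    print(f"pulled={pulled:5d} known={known:5d} pending={known - pulled}")
```

In this toy the pending count rises from 16 after the first node to 256 after the seventeenth, and only reaches 0 at the very last of the 4369 nodes, so the gap between known and pulled is not a countdown.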

Q: I'm stuck at 64 blocks behind mainnet?!

A: As explained above, you are not stuck, just finished with the block download phase and waiting for the state download phase to complete too. This latter phase nowadays takes a lot longer than just getting the blocks.

Q: Why does downloading the state take so long, I have good bandwidth?

A: State sync is mostly limited by disk IO, not bandwidth.

The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes. This is a horrible way to store data on a disk, because there's almost no structure in it, just random numbers referencing even more random numbers. This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way.

Not only is storing the data very suboptimal, but due to the 200 modifications per second and the pruning of past data, we cannot even download it in a properly pre-processed way to make it import faster without the underlying database shuffling it around too much. The end result is that even a fast sync nowadays incurs a huge disk IO cost, which is too much for a mechanical hard drive.

Q: Wait, so I can't run a full node on an HDD?

A: Unfortunately not. Doing a fast sync on an HDD will take more time than you're willing to wait with the current data schema. Even if you do wait it out, an HDD will not be able to keep up with the read/write requirements of transaction processing on mainnet.

You however should be able to run a light client on an HDD with minimal impact on system resources. If you wish to run a full node however, an SSD is your only option.

ivica7 commented 6 years ago

@karalabe I thought, I would fully understand the fast sync mode, but now I am a little bit confused. I was thinking that fast sync will pick highest node known - 64 and start fetching the state for this block only, while in parallel downloading all the headers/receipts until this block. As you explained, header/receipt download will finish significantly faster than importing the states, so at the end of the fast sync part of the process, we'll see a ton of "importing state entries" messages in the log, before the node switches to the full sync.

Now, what I do not understand: why is data morphing along the way? Is he trying to reconstruct the trie always based on the stateHashRoot known from the most recent header downloaded?

Is the algorithm like:

I thought it would be more straight-forward (without morphing state-trie), like:

EDIT: Moreover, I am able to fast sync the BC on a machine with an HDD. I tried it yesterday with geth 1.7.2. The only problems I had were some deadlocks in acquiring a semaphore. I had to restart some 10-20 times, as some people described above. It took ~1 day in total to finish the fast sync part and switch to full sync. From there on I had slow "import new chain segment", maybe ~2 blocks/15s on average. But it is still possible to catch up.

EDIT2: OK, I just noticed that my mental model doesn't make sense, since you cannot know if P is OK if you don't have all the block headers before it. So geth is doing the state download based on the state root hash of the most recent header it has verified so far, right? And because of that, it has to cope with the morphing trie. In this case, wouldn't it be better to serialize the algorithm in order to reduce its complexity? If you first focus on downloading the headers as fast as possible and then start to fetch the state for only one single block, you will not have to reorganize the trie all the time. Could that be faster than what we have now?

It would be:

the-artifabrian commented 6 years ago

@karalabe Thank you so much for the detailed explanation!

sleimana commented 6 years ago

@karalabe Appreciate your time, and thank you for the detailed answer. Every time geth crashes the states reset to zero. Will they continue downloading from the point where they stopped, or are they forgotten so that the download starts from the beginning again? How do I prevent the geth process from being killed? I run Ubuntu 16.04 / 3GB RAM / SSD, and I configured geth as follows in the supervisor configuration:

command=/usr/bin/geth --syncmode "fast" --cache=2024 --rpc --rpcaddr 127.0.0.1 rpcport=8545 --rpcapi "web3,eth,debug" --maxpeers "128"
autostart=true
autorestart=true

EDIT: Now I know the state tries are forgotten after geth restarts, so I lowered the cache to 768 to avoid being killed by the system, and it seems to work (synced ~18M in 12 hours; I guess it will take about 3 days to finish).

xiaoyao1991 commented 6 years ago

@ivica7 I think currently it IS downloading the state only at latest block according to https://github.com/ethereum/go-ethereum/blob/master/eth/downloader/downloader.go#L1366 (and you'll see the latest param passed in processFastSyncContent is the latest block known when sync begins). It did cancel the state downloading process along the way when it finds the state root to download is getting too much behind the new latest.

stricq commented 6 years ago

If you download the full blockchain onto 8 HDDs with a Simple (Striped) Partition (Storage Spaces) it is fast enough to keep up with the speed demand.

Mergathal commented 6 years ago

Honestly, this is not a disk I/O issue. Too many users complaining about this are using SSDs. I am running into it as well with an NVMe SSD with over 200GB free on the drive. Disk usage tops out at 7% with Geth running.

I have restarted many times, cleared the blockchain data and started fresh, have 175mbps down on my internet. So the issue doesn't appear to be on my end.

However, I constantly see things like this related to peers:

WARN [04-14|18:05:35] Node data write error err="state node e2a085…302335 failed with all peers (3 tries, 3 peers)"
WARN [04-14|18:05:35] Synchronisation failed, retrying err="state node e2a085…302335 failed with all peers (3 tries, 3 peers)"
WARN [04-14|18:05:43] Synchronisation failed, dropping peer peer=74750dbb2b166634 err="action from bad peer ignored"
INFO [04-14|18:07:18] Imported new block headers count=1 elapsed=4.009ms number=5442163 hash=ec78a0…4c2272 ignored=0
INFO [04-14|18:07:48] Imported new block headers count=1 elapsed=3.999ms number=5442164 hash=cd8840…75a192 ignored=0
WARN [04-14|18:07:51] Node data write error err="state node be2a05…8dfaa0 failed with all peers (2 tries, 2 peers)"
WARN [04-14|18:07:51] Rolled back headers count=60 header=5442164->5442104 fast=5442096->5442096 block=0->0
WARN [04-14|18:07:51] Synchronisation failed, retrying err="state node be2a05…8dfaa0 failed with all peers (2 tries, 2 peers)"
INFO [04-14|18:07:54] Imported new block headers count=0 elapsed=910.8µs number=5442164 hash=cd8840…75a192 ignored=68
INFO [04-14|18:08:04] Imported new block headers count=1 elapsed=3.927ms number=5442165 hash=1c0405…2540b9 ignored=0

Etherscan shows Last Block = 5442168 (14.3s)

When I check eth.syncing:

{
  currentBlock: 5442096,
  highestBlock: 5442167,
  knownStates: 9130491,
  pulledStates: 9125287,
  startingBlock: 5442004
}

My highest block is only one behind and I am less than 100 blocks behind.

It looks like a peering issue and them failing on a regular basis.

garyng2000 commented 6 years ago

it is a peering issue, based on your data, still a long way to go. the # of state is in the range of at least 50M.

wtfiwtz commented 6 years ago

@Mergathal have a look at this blog write-up: http://www.freekpaans.nl/2018/04/anatomy-of-a-geth-full-sync/ http://www.freekpaans.nl/2018/04/anatomy-geth-fast-sync/

It is not entirely due to disk I/O, for sure... there are inherent delays in syncing, due to how many peers, and the information you are waiting on from those peers.

If you are waiting on a peer to respond, you might need to reach a timeout before continuing with parts of the block sync.

Also, if the most recent data is not being shared quickly between all peers in the network, then parts of the network are starved of the latest blocks, and there will be a lag in the syncing process.

This info comes from here: https://github.com/ethereum/go-ethereum/issues/14647#issuecomment-378141722

wtfiwtz commented 6 years ago

Also @Mergathal, don't restart geth as it will need to restart the validation of the chain structures from the beginning (... i'm on 1.7.2 on my MacBook Pro, but probably still an issue with the latest v1.8.3).

Mergathal commented 6 years ago

This section right here strikes me as a big part of this issue.

"What also surprises me is that all the Ethereum data is already larger than the entire Bitcoin data directory (about 200GB), while Bitcoin is almost 3 times older than Ethereum. Clearly, Ethereum grows much faster than Bitcoin. I guess that it’ll become even harder to do full syncs in the future, and that will probably mean the number of full nodes will decrease. That can’t be good."

My thought is that we are starting to connect to a lot more peers that are starting to struggle and fail to sync fully and this is causing us to stay in a perpetual state of out of sync. If these peers never fully sync themselves, we will never catch up either unless we are syncing with only good peers who are fully synced.

I am running the latest 1.8.3 and yes, it is still an issue. I have left mine alone for days and it still is no closer to syncing fully.

wtfiwtz commented 6 years ago

Actually the latest 1.8.2 and above should save its current sync state: https://github.com/ethereum/go-ethereum/pull/16224

wtfiwtz commented 6 years ago

@Mergathal that is the nature of a blockchain-based approach. Since Bitcoin only targets a block every 10 minutes, its throughput is lower and the number of blocks is lower.

Ethereum generates a new block roughly every 15 seconds, allowing more transactions and faster response times. There will naturally be more data generated due to this approach. The data would need to be pruned somehow to keep it at a reasonable level.

Interestingly, in http://www.freekpaans.nl/2018/04/anatomy-geth-fast-sync/, a completed fast sync only took 77GB of blockchain data stored locally. I've routinely destroyed fast syncs with much more data than that (... I have limited space on my MacBook Pro). It seems to me that the longer you are pulling down the state tries, the more data is stored locally. It may also depend on how long you are "full syncing" for once the fast sync is complete. I'm yet to fully understand why, but it's an interesting observation.

garyng2000 commented 6 years ago

We constantly 'refresh' by fast syncing from scratch to keep the size in check. An initial fast sync is only around 60G (as of maybe a month ago), then the size grows. After one month we are seeing 140G. Not sure if it is because older state needs to be pulled in or what. Does anyone with a 'true' full sync know the current disk size?

wtfiwtz commented 6 years ago

@garyng2000 a full sync took 220GB according to the articles linked above. So it would be approximately 80GB a month as a "fast sync" switches to a "full sync".

garyng2000 commented 6 years ago

@wtfiwtz that is something that puzzles me. If it is 80GB a month we are talking about TBs of data soon, so how come a 'true' full sync is only 220G? If that is the case, maybe I should do a true full sync (from scratch). That can take a bit of time, but the disk growth rate would be slower? Strange.

wtfiwtz commented 6 years ago

@garyng2000 it could be because the accumulated state is bigger when you participate in the immediate verification of the transactions, whereas post-verification there is not as much information to download from peers. However, you would need someone more knowledgeable about Ethereum's inner workings to confirm or deny that.

CryptoKiddies commented 6 years ago

I'm on geth v1.8.4 and Ubuntu 16.04. Not only is geth stopping before final sync, but it completely stalls around 30-60 minutes after starting a sync. The CPU usage drops to ~3% of capacity and stays there.


I see continuous error messages for connecting to nodes, and the state and blocks completely stop updating. I have to restart geth (I use systemd restart). This is very concerning because I don't want my node to stall in the middle of serving our dapp.

suspended commented 6 years ago

@GeeeCoin you might want to try v1.8.3. I had a similar issue to yours when I moved from .3 to .4

CryptoKiddies commented 6 years ago

@suspended v1.8.6 has the same unresolved issue. Downgrading to geth v1.8.3 worked for about 3 weeks, but now I am facing the same issues.

mtj151 commented 6 years ago

I am also having the same sync problems... dropping peers etc. I am almost synced (about 50-100 blocks behind if I let it run). If I restart geth it catches up until peers start to drop again.

Using Ubuntu 16.04. I have tried different versions of Geth down to 1.8.2. Built the dev version too with no change.

I have lots of experience running a node having done it since the start... but I did re-download the block chain a month or 2 ago.

I use a SATA 500GB SSD but it is encrypted on the drive level and the home directory which is where the blockchain is stored. The encryption means that the read/write abilities are slower and using a disk monitor it shows a high level of activity constantly while geth is running.

I understand storing/using the blockchain on encrypted drive is probably not the best setup (for speed and amount of read writes/life of SSD) so I'm guessing the next thing I should try is a new separate un-encrypted SSD to store the chain... but I have not got round to doing so yet (having another SSD purely for eth blockchain is fairly expensive option). Currently my chaindata folder is 358.8GB

Looks like Ubuntu 16.04 is a consistent part of this thread/problem?

CryptoKiddies commented 6 years ago

@mtj151 good observation. I'm not ruling out any factors at this point. Is anyone using AWS by any chance?

mtj151 commented 6 years ago

I have also noticed that I am unable to send transactions while I am getting the "Synchronisation failed, retrying err="block download cancelled (requested)"" warnings.

I sent one transaction fine but then the warnings come up and it wouldn't let me send another transaction (even after the messages stopped and syncing started again). I had to completely restart geth to be able to send the transaction.

ghost commented 6 years ago

@GeeeCoin I was unable to get a Geth node to stay up to date with chaintip on AWS in any meaningful time without using Provisioned IOPS SSDs on EBS-optimized instances or the i3 storage-optimized instances with 8GB RAM or greater. Even then, I had to write a watchdog to kick geth over every now and then for when it would drop all its peers or lag too much behind the chaintip. Now I just have dedicated boxes for geth nodes running NVMe SSD in the datacenter, and a NUC for LAN dev which has a 1 TB SATA SSD, 8GB RAM and a quad-core processor.
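The watchdog mentioned above is not shown in this thread, so the following is only a guess at what its core decision could look like: a pure function that decides whether sync has stalled from a series of progress samples. How the samples are collected (for example by polling `eth.syncing` over RPC) and how geth is restarted are left out. The sample layout, the 10-minute window, and the `is_stalled` name are all assumptions.

```python
def is_stalled(samples, window_s=600):
    """Decide whether sync made no progress over the last `window_s` seconds.

    samples: list of (unix_time, current_block, pulled_states), oldest first.
    Progress in either counter counts, because during the state download
    the block number can legitimately stand still for a long time.
    """
    if not samples:
        return False
    latest_t, latest_block, latest_states = samples[-1]
    # Keep only samples that are at least window_s older than the latest one.
    old_enough = [s for s in samples if latest_t - s[0] >= window_s]
    if not old_enough:
        return False  # not enough history to judge yet
    # Compare against the most recent of those older samples.
    _, old_block, old_states = old_enough[-1]
    return latest_block <= old_block and latest_states <= old_states

# Block number frozen but state entries still arriving: not a stall.
downloading_state = [(0, 100, 1000), (300, 100, 5000), (600, 100, 9000)]
# Neither counter moved for ten minutes: candidate for a restart.
frozen = [(0, 100, 9000), (300, 100, 9000), (600, 100, 9000)]

print(is_stalled(downloading_state))  # False
print(is_stalled(frozen))             # True
```

Counting state progress as progress matters here: restarting a node that is merely busy with the state download would make things worse, particularly on versions before 1.8.2, which, according to an earlier comment, did not persist sync progress across restarts.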

CryptoKiddies commented 6 years ago

@10a7 appreciate the data point. If NUC is outperforming a quad core with 8GB in AWS, that's a problem. Amazon may have network latency that hasn't been optimized with the t. class. The i3 looks like an option. We're taking a look at Quarian; thanks for building that out!

mtj151 commented 6 years ago

Sounds like 10a7 had the same problem with lagging behind the chain tip... good description of the problem. Did NVMe SSD fix the problem?? I'm looking at getting one in the coming weeks to run geth.

ghost commented 6 years ago

@mtj151 NVMe SSD doesn't seem to matter. I have no trouble keeping SATA SSDs and bcache-fronted magnetic arrays intact and synced I/O wise.

If you are synced and "importing new chain segment", it seems to mostly be network issues that cause my nodes to fall behind. Restarting geth often helps to get different peers. Geth sync-after-fast-pivot is also much more reliable for me if I am not behind a NAT, and can forward/open 30303/tcp.

jdowning100 commented 6 years ago

FWIW I was able to get geth to fully sync by waiting until eth.blockNumber is near the numbers in eth.syncing and then restarting geth. I was able to do this at ~160m states. After restarting geth, it took about 20 min to catch up to the blockchain and now eth.syncing is false and the only output now is 'imported new chain segment' every time a new block is found.

karalabe commented 6 years ago

@ Syncing Ethereum is a pain point for many people, so I'll try to detail what's happening behind the scenes so there might be a bit less confusion.

The current default mode of sync for Geth is called fast sync. Instead of starting from the genesis block and reprocessing all the transactions that ever occurred (which could take weeks), fast sync downloads the blocks, and only verifies the associated proof-of-works. Downloading all the blocks is a straightforward and fast procedure and will relatively quickly reassemble the entire chain.

Many people falsely assume that because they have the blocks, they are in sync. Unfortunately this is not the case, since no transaction was executed, so we do not have any account state available (ie. balances, nonces, smart contract code and data). These need to be downloaded separately and cross checked with the latest blocks. This phase is called the state trie download and it actually runs concurrently with the block downloads; alas it take a lot longer nowadays than downloading the blocks.

So, what's the state trie? In the Ethereum mainnet, there are a ton of accounts already, which track the balance, nonce, etc of each user/contract. The accounts themselves are however insufficient to run a node, they need to be cryptographically linked to each block so that nodes can actually verify that the account's are not tampered with. This cryptographic linking is done by creating a tree data structure above the accounts, each level aggregating the layer below it into an ever smaller layer, until you reach the single root. This gigantic data structure containing all the accounts and the intermediate cryptographic proofs is called the state trie.

Ok, so why does this pose a problem? This trie data structure is an intricate interlink of hundreds of millions of tiny cryptographic proofs (trie nodes). To truly have a synchronized node, you need to download all the account data, as well as all the tiny cryptographic proofs to verify that no one in the network is trying to cheat you. This itself is already a crazy number of data items. The part where it gets even messier is that this data is constantly morphing: at every block (15s), about 1000 nodes are deleted from this trie and about 2000 new ones are added. This means your node needs to synchronize a dataset that is changing 200 times per second. The worst part is that while you are synchronizing, the network is moving forward, and state that you began to download might disappear while you're downloading, so your node needs to constantly follow the network while trying to gather all the recent data. But until you actually do gather all the data, your local node is not usable since it cannot cryptographically prove anything about any accounts.
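The "200 times per second" figure follows directly from the rough numbers above (about 1000 deletions plus about 2000 insertions per 15-second block). A small Go sketch of that arithmetic, with a hypothetical helper name:

```go
package main

import "fmt"

// churnPerSecond returns the number of trie node modifications per second,
// given how many nodes are deleted and added per block and the block time.
// The inputs are the approximate figures quoted in the comment, not measurements.
func churnPerSecond(deleted, added int, blockTimeSeconds float64) float64 {
	return float64(deleted+added) / blockTimeSeconds
}

func main() {
	fmt.Printf("%.0f trie node changes per second\n", churnPerSecond(1000, 2000, 15))
}
```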

If you see that you are 64 blocks behind mainnet, you aren't yet synchronized, not even close. You are just done with the block download phase and still running the state downloads. You can see this yourself via the seemingly endless Imported state entries [...] stream of logs. You'll need to wait that out too before your node comes truly online.


Q: The node just hangs on importing state entries?!

A: The node doesn't hang, it just doesn't know how large the state trie is in advance so it keeps on going and going and going until it discovers and downloads the entire thing.

The reason is that a block in Ethereum only contains the state root, a single hash of the root node. When the node begins synchronizing, it knows about exactly 1 node and tries to download it. That node can refer to up to 16 new nodes, so in the next step, we'll know about 16 new nodes and try to download those. As we go along the download, most of the nodes will reference new ones that we didn't know about until then. This is why you might be tempted to think it's stuck on the same numbers. It is not; rather, it's discovering and downloading the trie as it goes along.
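The discovery process can be illustrated with a toy model in Go. This is only a sketch under simplifying assumptions: nodes are identified by path strings rather than hashes, the "remote" trie is an in-memory map instead of network peers, the tree is perfectly balanced, and downloads happen one at a time. The names `buildTrie` and `syncState` are hypothetical and do not correspond to Geth's downloader code.

```go
package main

import "fmt"

// node is a toy trie node: it only records the IDs of its children
// (up to 16 in the real trie).
type node struct {
	children []string
}

// buildTrie creates a complete tree with the given fan-out and depth and
// returns the root ID together with a map standing in for the remote peers.
func buildTrie(fanout, depth int) (string, map[string]node) {
	remote := make(map[string]node)
	var build func(id string, level int)
	build = func(id string, level int) {
		n := node{}
		if level < depth {
			for i := 0; i < fanout; i++ {
				child := fmt.Sprintf("%s/%x", id, i)
				n.children = append(n.children, child)
				build(child, level+1)
			}
		}
		remote[id] = n
	}
	build("root", 0)
	return "root", remote
}

// syncState starts knowing only the root ID. Each downloaded node reveals
// its children, which are queued for download in turn. It returns the total
// number of nodes downloaded and the largest size the pending queue reached.
func syncState(root string, remote map[string]node) (int, int) {
	downloaded, peakPending := 0, 0
	pending := []string{root}
	for len(pending) > 0 {
		if len(pending) > peakPending {
			peakPending = len(pending)
		}
		id := pending[0]
		pending = pending[1:]
		n := remote[id] // "download" the node
		downloaded++
		// Only now do we learn which nodes exist below this one.
		pending = append(pending, n.children...)
	}
	return downloaded, peakPending
}

func main() {
	root, remote := buildTrie(16, 3)
	downloaded, peak := syncState(root, remote)
	fmt.Println("nodes downloaded:", downloaded)
	fmt.Println("peak pending queue:", peak)
}
```

The total is unknown until the queue finally drains, which is why a syncing node cannot report a meaningful percentage for the state download.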

Q: I'm stuck at 64 blocks behind mainnet?!

A: As explained above, you are not stuck; you have just finished the block download phase and are waiting for the state download phase to complete too. This latter phase nowadays takes a lot longer than just getting the blocks.

Q: Why does downloading the state take so long, I have good bandwidth?

A: State sync is mostly limited by disk IO, not bandwidth.

The state trie in Ethereum contains hundreds of millions of nodes, most of which take the form of a single hash referencing up to 16 other hashes. This is a horrible way to store data on a disk, because there's almost no structure in it, just random numbers referencing even more random numbers. This makes any underlying database weep, as it cannot optimize storing and looking up the data in any meaningful way.

Not only is storing the data very suboptimal, but due to the 200 modifications per second and the pruning of past data, we cannot even download it in a properly pre-processed way to make it import faster without the underlying database shuffling it around too much. The end result is that even a fast sync nowadays incurs a huge disk IO cost, which is too much for a mechanical hard drive.

Q: Wait, so I can't run a full node on an HDD?

A: Unfortunately not. Doing a fast sync on an HDD will take more time than you're willing to wait with the current data schema. Even if you do wait it out, an HDD will not be able to keep up with the read/write requirements of transaction processing on mainnet.

You should, however, be able to run a light client on an HDD with minimal impact on system resources. If you wish to run a full node, an SSD is your only option.

CryptoKiddies commented 6 years ago

@karalabe Thanks for breaking this down again. We knew most of this about Geth/Eth already, but I'm really surprised as to how suboptimal the state trie system is at being stored to disk; I thought the whole point of building ethereum this way (with modified patricia trees etc.) was to minimize footprint/disk mods, but looks like innovation in storage structures is still needed.

hustnn commented 6 years ago

@karalabe Nice introduction. I understand fast sync internals better now.