openethereum / parity-ethereum

The fast, light, and robust client for Ethereum-like networks.

Parity 2.7.2 stuck between block heights 2.42 million and 2.72 million at 0.0 blk/s #11642

Open jiangxjcn opened 4 years ago

jiangxjcn commented 4 years ago

memory: 64 GB (the machine has 1 TB, but I allocate only 64 GB to Parity)
cpu: 12 cores
disk: 6 TB HDD (I know that not using an SSD makes the sync slow, but 0.0 blk/s for a long time is not reasonable, and pidstat does not show Parity under a high I/O load)

my config.toml:
mode = "active"
base_path = "/xxx/xxx/parity"
port = 30304
warp = false
min_peers = 50
max_peers = 100
port = 9999
server_threads = 2
tracing = "on"
pruning = "fast"
pruning_history = 50000
cache_size = 64000
scale_verifiers = true
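The flat listing above repeats `port`, which a TOML parser would reject as a duplicate key within one table. The sketch below flags the repeat and renders the same values grouped into sections. The section names (`[parity]`, `[network]`, `[rpc]`, `[footprint]`) and the assignment of keys to them are an assumption based on the usual Parity config layout; the post itself shows no section headers.

```python
from collections import Counter

# The settings exactly as listed in the post (flat, no section headers).
FLAT_CONFIG = """\
mode = "active"
base_path = "/xxx/xxx/parity"
port = 30304
warp = false
min_peers = 50
max_peers = 100
port = 9999
server_threads = 2
tracing = "on"
pruning = "fast"
pruning_history = 50000
cache_size = 64000
scale_verifiers = true
"""

# ASSUMPTION: grouping of keys into sections; not shown in the original post.
SECTIONS = {
    "parity": {"mode": "active", "base_path": "/xxx/xxx/parity"},
    "network": {"port": 30304, "warp": False, "min_peers": 50, "max_peers": 100},
    "rpc": {"port": 9999, "server_threads": 2},
    "footprint": {
        "tracing": "on",
        "pruning": "fast",
        "pruning_history": 50000,
        "cache_size": 64000,
        "scale_verifiers": True,
    },
}


def duplicate_keys(text):
    """Return the keys that appear more than once in a flat key = value listing."""
    keys = [line.split("=", 1)[0].strip() for line in text.splitlines() if "=" in line]
    return [key for key, count in Counter(keys).items() if count > 1]


def toml_value(value):
    """Format a Python value as a TOML literal (bool, string, or integer only)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, str):
        return f'"{value}"'
    return str(value)


def render(sections):
    """Render a dict of sections as TOML text, one [section] table per entry."""
    blocks = []
    for name, keys in sections.items():
        lines = [f"[{name}]"] + [f"{k} = {toml_value(v)}" for k, v in keys.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```

Calling `duplicate_keys(FLAT_CONFIG)` reports `port`, and `print(render(SECTIONS))` shows the two ports under separate tables, so the P2P port and the RPC port no longer collide.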

Expected behavior: sync at around 0.1 blk/s rather than a constant 0.0 blk/s. (I know about the DDoS attack around block height 2.4 million, and that these blocks take a long time to sync.)

Actual behavior: when I restart Parity, it syncs normally for around 2 hours at 0.1-0.2 blk/s. After that, Parity seems to get stuck: it synced only around 60 blocks over the whole night.

2020-04-21 11:31:35 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 57 req/s, 34 µs
2020-04-21 11:31:41 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 48 req/s, 34 µs
2020-04-21 11:31:53 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 54 req/s, 34 µs
2020-04-21 11:31:59 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 36 req/s, 34 µs
2020-04-21 11:31:59 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 36 req/s, 34 µs
2020-04-21 11:32:05 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 58 req/s, 34 µs
2020-04-21 11:32:17 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 0 req/s, 34 µs
2020-04-21 11:32:23 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 55 req/s, 34 µs
2020-04-21 11:32:23 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 55 req/s, 34 µs
2020-04-21 11:32:29 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 37 req/s, 34 µs
2020-04-21 11:32:41 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 50 req/s, 34 µs
2020-04-21 11:32:48 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 38/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 0 req/s, 34 µs
2020-04-21 11:32:48 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 37/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 0 req/s, 34 µs
2020-04-21 11:32:54 Syncing #2430865 0x2a64…bd7e 0.00 blk/s 0.0 tx/s 0.0 Mgas/s 0+10141 Qed #2441006 37/50 peers 457 KiB chain 21 GiB db 56 MiB queue 90 MiB sync RPC: 0 conn, 67 req/s, 34 µs
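To make a stall like this easier to spot automatically, the status lines can be parsed into numbers. The following is a rough Python sketch, not a Parity tool: the regular expression is written against the log lines quoted above only, the names `parse_status` and `is_stalled` are hypothetical, and the two numbers before `Qed` are simply summed as "queued" without further interpretation.

```python
import re

# Pattern written against the status lines quoted in this issue only.
STATUS_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+Syncing\s+#(?P<best>\d+)\s+\S+\s+"
    r"(?P<blk_s>\d+(?:\.\d+)?) blk/s.*?"
    r"(?P<qed_a>\d+)\+\s*(?P<qed_b>\d+)\s+Qed\s+#(?P<target>\d+)\s+"
    r"(?P<peers>\d+)/(?P<max_peers>\d+) peers.*?"
    r"(?P<req_s>\d+) req/s"
)


def parse_status(line):
    """Parse one 'Syncing ...' status line into a dict, or return None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "ts": fields["ts"],
        "best": int(fields["best"]),          # block the node is currently at
        "blk_s": float(fields["blk_s"]),      # reported import rate
        "queued": int(fields["qed_a"]) + int(fields["qed_b"]),
        "target": int(fields["target"]),      # block number printed after "Qed"
        "peers": int(fields["peers"]),
        "req_s": int(fields["req_s"]),        # reported RPC requests per second
    }


def is_stalled(entries):
    """True if the best block never changes and every entry reports 0 blk/s."""
    return (
        len(entries) >= 2
        and len({entry["best"] for entry in entries}) == 1
        and all(entry["blk_s"] == 0.0 for entry in entries)
    )
```

Feeding the lines above through `parse_status` and passing the results to `is_stalled` returns `True`: the best block stays at 2430865 for the whole excerpt while the node still reports dozens of RPC requests per second.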

jiangxjcn commented 4 years ago

And another question is why there are so many RPC requests. I only request the block height once every 2 seconds.

adria0 commented 4 years ago

And another question is why there are so many RPC requests.

Maybe there's another DoS there. Are you protecting the RPC port to avoid unauthorized connections, @jiangxjcn?

jiangxjcn commented 4 years ago

And another question is why there are so many RPC requests.

Maybe there's another DoS there. Are you protecting the RPC port to avoid unauthorized connections, @jiangxjcn?

No, I did not do anything to protect the RPC port.

jiangxjcn commented 4 years ago

I use a Python script to request the block height every 2 seconds, with code as follows:

import requests

# Local JSON-RPC endpoint of the Parity node
url = "http://127.0.0.1:9999"
head = {'Content-type': 'application/json'}
data = '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":67}'
response = requests.post(url, headers=head, data=data)
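The snippet above sends a single request; the surrounding loop and the decoding of the reply are not shown. One possible shape for the full poller is sketched below. It swaps `requests` for the standard-library `urllib` so that it runs without extra packages, and the names `decode_block_number`, `fetch_block_number`, and `watch`, as well as the 300-second stall threshold, are assumptions made for illustration.

```python
import json
import time
import urllib.request

RPC_URL = "http://127.0.0.1:9999"  # same endpoint as the script in the post


def decode_block_number(reply):
    """Convert an eth_blockNumber JSON-RPC reply (hex string result) to an int."""
    if "error" in reply:
        raise RuntimeError(f"RPC error: {reply['error']}")
    return int(reply["result"], 16)


def fetch_block_number(url=RPC_URL, timeout=5.0):
    """POST one eth_blockNumber request and return the block height as an int."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 67}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_block_number(json.load(response))


def watch(interval=2.0, stall_after=300.0, fetch=fetch_block_number):
    """Poll every `interval` seconds and warn once the height stops moving."""
    last_height = None
    last_change = time.monotonic()
    while True:
        height = fetch()
        now = time.monotonic()
        if height != last_height:
            last_height, last_change = height, now
        elif now - last_change >= stall_after:
            print(f"no new block for {now - last_change:.0f} s (height {height})")
        time.sleep(interval)
```

Running `watch()` against the local node issues one request every 2 seconds, that is 0.5 req/s, which gives a baseline to compare with the request rate printed in the Parity log.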

I think this may be one of the reasons Parity gets stuck.

jiangxjcn commented 4 years ago

I have set the RPC interface = "local", and the RPC request rate now seems normal. But Parity still gets stuck. Each time I restart it, it syncs normally for a short time, gaining no more than 500 new blocks per restart. I wonder whether this depends on the peers I am connected to.

jiangxjcn commented 4 years ago

@adria0 Do you have any other suggestions?

adria0 commented 4 years ago

why there are so many RPC requests

@jiangxjcn, when you said this, which requests were you referring to?

jiangxjcn commented 4 years ago

@adria0 I initially assumed the requests were my own requests for the sync height. But they obviously are not, because I request the block height through JSON-RPC once every 2 seconds, while the Parity log shows around 300 requests per second. After I set the RPC interface = local, the request count in the Parity log looks normal. However, my main problem now is that Parity often gets stuck: the rate is always 0 blk/s, and it now syncs only about 10 blocks every 30 minutes.
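To put these figures in perspective, here is a small back-of-the-envelope calculation in Python. It uses only numbers quoted in this thread; pairing the 10141 queued blocks from the earlier log with the "10 blocks every 30 minutes" rate is an assumption made for illustration, since the two were reported at different times. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def sync_rate(blocks, seconds):
    """Average import rate in blocks per second."""
    return blocks / seconds


def eta_days(blocks_remaining, rate_blk_s):
    """Days needed to import `blocks_remaining` at a constant rate."""
    if rate_blk_s <= 0:
        raise ValueError("rate must be positive")
    return blocks_remaining / rate_blk_s / SECONDS_PER_DAY


# "about 10 blocks every 30 minutes"
observed_rate = sync_rate(10, 30 * 60)

# One eth_blockNumber call every 2 seconds versus ~300 req/s seen in the log.
own_req_s = 1 / 2
request_ratio = 300 / own_req_s

print(f"observed rate: {observed_rate:.4f} blk/s")
print(f"time to import 10141 queued blocks: {eta_days(10141, observed_rate):.1f} days")
print(f"logged requests are about {request_ratio:.0f}x the script's own rate")
```

At that pace the queued blocks alone would take roughly three weeks, and the logged request rate is several hundred times what the polling script could generate, which is consistent with the traffic having come from elsewhere.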

jiangxjcn commented 4 years ago

@adria0 Also, every time I restart Parity, it syncs new blocks at 0.1-0.4 blk/s, which I think is normal. But not long after each restart, the speed drops to 0.

Another important piece of information is that the I/O load is low.

adria0 commented 4 years ago

I'm curious whether the known DDoS attack around block 2.4 million that you mentioned had some kind of impact.

Could you provide more information about the attack, please?

jiangxjcn commented 4 years ago

I'm curious whether the known DDoS attack around block 2.4 million that you mentioned had some kind of impact.

Could you provide more information about the attack, please?

A description of the DDoS attack around block height 2.4 million: https://ethereum.stackexchange.com/questions/9883/why-is-my-node-synchronization-stuck-extremely-slow-at-block-2-306-843

The most likely cause I can come up with now is a problem with this Parity version (for example, https://github.com/openethereum/openethereum/issues/11494). Many people who use the latest Parity 2.7.2 run into a similar problem. I will try to resync Ethereum with Parity 2.5.13. If it works correctly, I will come back and report.

jiangxjcn commented 4 years ago

@adria0 As expected, Parity 2.5.13 works really well. It took only about 3.5 hours to catch up with the Parity 2.7.2 node, which is stuck at around block 2433000. The speed is now around 50-100 blk/s, much higher than I expected. I will monitor the sync over the next several days.

dvdplm commented 4 years ago

Many people who use the latest Parity 2.7.2 run into a similar problem.

2.7.2 makes changes to the database layout on disk that can cause significant reshuffling of data to happen. This can take a long while to sort out, but what we've seen from previous reports is that the dip in performance usually fixes itself after a while, sometimes after several days. If you have a chance to do so, please consider running your 2.7.2 node for a week or so and see if it looks better after that.

jiangxjcn commented 4 years ago

Many people who use the latest Parity 2.7.2 run into a similar problem.

2.7.2 makes changes to the database layout on disk that can cause significant reshuffling of data to happen. This can take a long while to sort out, but what we've seen from previous reports is that the dip in performance usually fixes itself after a while, sometimes after several days. If you have a chance to do so, please consider running your 2.7.2 node for a week or so and see if it looks better after that.

Thanks for your advice, but I don't have that much time right now. When I have enough time, I will give it a try. Another thing worth mentioning is that too many peers seem to disconnect and reconnect when syncing with Parity 2.7.2.

nysxah commented 4 years ago

experiencing similar issue, very slow sync after upgrading to 2.7.2 from 2.5.13.

a 2.7.2 node fell behind on blocks, and couldn’t catch up to tip over several hours; at the rate it was syncing, it would likely take days.

2.5.13 did not have this issue. if it fell behind, it would catch up ~150 blocks within a couple of minutes.

edit: at this point all nodes which were upgraded to 2.7.2 are behind; new blocks are produced quicker than it is syncing. reverting to 2.5.13 as the latest version is unusable.