openethereum / parity-ethereum

The fast, light, and robust client for Ethereum-like networks.

100% CPU utilization and RPC service doesn't respond #6983

Closed travelnotes closed 6 years ago

travelnotes commented 6 years ago

Before filing a new issue, please provide the following information.

I'm running:

  • Parity version: 1.7.0
  • Operating system: Linux
  • And installed: via installer(deb)

Your issue description goes here below. Try to include actual vs. expected behavior and steps to reproduce the issue.

Background: We use Parity in our production environment (Linux, 1.7 beta, installed via the binary installer) with the PoA consensus engine configured. This blockchain worked well for almost two months and generated about 2.3 million blocks. We have not created any new transactions since 11 Oct (the chain has only been generating blank blocks).

Issue: We found that the RPC/WS service stopped working yesterday. CPU utilization goes to 100% after some RPC calls (e.g. eth_estimateGas; the RPC call blocks there).

Upgrade to 1.7.8: We tried to upgrade to the stable version 1.7.8 (we installed the new version and copied the existing DB files to the new DB path), but it still doesn't work. Could you help? Thanks.

As it is our Production environment, we have to stop the service until we find a solution.

travelnotes commented 6 years ago

Chain configuration:

{
    "name": "ProductPoA",
    "engine": {
        "authorityRound": {
            "params": {
                "gasLimitBoundDivisor": "0x400",
                "stepDuration": "2",
                "validators": {
                    "safeContract": "0x0000000000000000000000000000000000000005"
                }
            }
        }
    },
    "params": {
        "maximumExtraDataSize": "0x20",
        "minGasLimit": "0x1388",
        "networkID": "0x2323"
    },
    "genesis": {
        "seal": {
            "authorityRound": {
                "step": "0x0",
                "signature": "0x0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
            }
        },
        "difficulty": "0x20000",
        "gasLimit": "0x12a05f200"
    },
    "accounts": {
        "0x0000000000000000000000000000000000000001": { "balance": "1", "builtin": { "name": "ecrecover", "pricing": { "linear": { "base": 3000, "word": 0 } } } },
        "0x0000000000000000000000000000000000000002": { "balance": "1", "builtin": { "name": "sha256", "pricing": { "linear": { "base": 60, "word": 12 } } } },
        "0x0000000000000000000000000000000000000003": { "balance": "1", "builtin": { "name": "ripemd160", "pricing": { "linear": { "base": 600, "word": 120 } } } },
        "0x0000000000000000000000000000000000000004": { "balance": "1", "builtin": { "name": "identity", "pricing": { "linear": { "base": 15, "word": 3 } } } },
        "0x0000000000000000000000000000000000000005": { "balance": "1", "constructor" : ****}
    }
}

travelnotes commented 6 years ago

Node configuration:

[parity]
chain = "contract-product-spec.json"
base_path = "./chainData/networks/parity0"

[network]
reserved_peers = "reservedPeers.enode"
reserved_only = true
port = 30312

[rpc]
interface = "121.42.."
port = 8552
apis = ["web3", "eth", "net", "personal", "parity", "parity_set", "traces", "rpc", "parity_accounts"]

[websockets]
interface = "121.42.."
port = 8547

[ui]
interface = "121.42.."
port = 8192

[account]

[mining]
engine_signer = "0x7a95db805a7f65f7475f119ae79568c6f2086f1d"
force_sealing = true
reseal_on_txs = "all"
reseal_min_period = 2000
reseal_max_period = 30000
usd_per_tx = "0"
gas_floor_target = "20000000"

travelnotes commented 6 years ago

It seems there is something wrong with the block data. We tried removing some blocks (by starting a new node and not syncing the latest blocks), and this node works again. So my questions are:

1. How can I remove blocks from a specified height?
2. How can I find the cause of the issue, and is there a solution?

travelnotes commented 6 years ago

img_8495

5chdn commented 6 years ago

"We tried to upgrade to the stable version 1.7.8 (we installed the new version and copied the existing DB files to the new DB path), but it still doesn't work. Could you help? Thanks."

Could you expand on this? What exactly is not working? You don't need to copy the database when upgrading to a patch version.

"It seems there is something wrong with the block data. We tried removing some blocks (by starting a new node and not syncing the latest blocks), and this node works again. So my questions are: 1. How can I remove blocks from a specified height?"

It seems you were able to remove some blocks by cutting off the sync at some point. In what way did it solve your problem? Other than (1), there is no other way to do this.

Are you using the User Interface by the way?

Can you share more complete logs of your RPC traces?

travelnotes commented 6 years ago

1. Upgrade details: I installed 1.7.8 from its binary package. When I started the node with the same configuration file, it created a different DB path instead of using the existing one, which is why I copied the DB files (the two folders in networks/paritynode/chains/PoA/db/7348aba6abe735d1). "It doesn't work" means the performance problem still exists: the node works well, but if I send an "eth_estimateGas" RPC call, CPU utilization goes to 100% for about a minute and a half and then returns to normal. The calls to get the block height or block details work well.

2. I removed about 1 million blank blocks (by stopping the sync at a given height), and the issue went away.

3. Another approach seems to work (without removing any blocks): I created a new transaction that sent ether to an address. It took about 2 minutes to finish. After that everything seems OK: the eth_estimateGas RPC responds immediately and CPU usage is fine.

4. I don't use the UI component. How can I get the full RPC trace?
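To put a number on the slowdown described in point 1, one option is to time a cheap call and the slow call against the same node. The following is a minimal Python sketch, not something posted by the participants. It assumes the HTTP JSON-RPC endpoint listens on port 8552 (from the [rpc] section above) on a placeholder host, and it reuses the eth_estimateGas parameters from the RPC trace posted later in this thread. The helper names are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: host is a placeholder; port 8552 is taken from the posted [rpc] config.
RPC_URL = "http://127.0.0.1:8552"

# Parameters copied from the eth_estimateGas request in the posted RPC trace.
ESTIMATE_GAS_PARAMS = [{
    "from": "0xd70db13e77dd64e7b3d53f2442912c26a68eaa7f",
    "to": "0x757bc081b80d8287a6bedc8167c4a131a42f5179",
    "data": "0x7e454bb000000000000000000000000049db4bd97d0a29e5ef3cda9c5584c907a35d357a",
}]


def build_request(method, params, request_id=1):
    """Return a JSON-RPC 2.0 request body as UTF-8 bytes."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(payload).encode("utf-8")


def timed_call(method, params, url=RPC_URL, timeout=300):
    """Send one JSON-RPC call and return (elapsed_seconds, decoded_response)."""
    request = urllib.request.Request(
        url,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
    )
    start = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return time.monotonic() - start, body


def compare_latency(url=RPC_URL):
    """Print the latency of a cheap call next to the call reported as slow."""
    for method, params in [("eth_blockNumber", []), ("eth_estimateGas", ESTIMATE_GAS_PARAMS)]:
        elapsed, body = timed_call(method, params, url=url)
        print(f"{method}: {elapsed:.2f} s -> {body}")
```

Calling compare_latency() against the affected node should show whether only eth_estimateGas is slow, which matches what is reported above. The 300-second timeout is an arbitrary choice meant to outlast the reported 90-second stall.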

5chdn commented 6 years ago

Run the node with -l rpc=trace :)

travelnotes commented 6 years ago

I started only one node:

root@bc01:~/parity1# parity --config node01.toml -l rpc=trace
Loading config file from node01.toml
2017-11-05 10:15:23 main INFO parity::run Starting Parity/v1.7.0-beta-5f2cabd-20170727/x86_64-linux-gnu/rustc1.18.0
2017-11-05 10:15:23 main INFO parity::run Keys path ./chainData/networks/parity0/keys/ProductPoA
2017-11-05 10:15:23 main INFO parity::run DB path ./chainData/networks/parity0/chains/ProductPoA/db/7348aba6abe735d1
2017-11-05 10:15:23 main INFO parity::run Path to dapps ./chainData/networks/parity0/dapps
2017-11-05 10:15:23 main INFO parity::run State DB configuration: fast
2017-11-05 10:15:23 main INFO parity::run Operating mode: active
2017-11-05 10:15:24 main INFO ethcore::service Configured for ProductPoA using AuthorityRound engine
2017-11-05 10:15:58 WARN jsonrpc_ipc_server::server Removed existing file './chainData/networks/parity0/jsonrpc.ipc'.
2017-11-05 10:16:00 IO Worker #3 INFO import Imported #2423462 ec86…d9f6 (0 txs, 0.00 Mgas, 124.68 ms, 0.57 KiB)
2017-11-05 10:16:04 IO Worker #2 INFO import Imported #2423463 f456…4c3a (0 txs, 0.00 Mgas, 0.42 ms, 0.57 KiB)
2017-11-05 10:16:04 IO Worker #3 INFO network Public node URL: enode://c3be1501dd4da6266f44a275f61a2ed34e3594dd5d44e17f04ad0aa021520a471e2bf26505199386e9ae452d2d6e631b4ed2ec0ede1e387cb64169e2cef2ebab@10.45.17.42:30311
2017-11-05 10:16:08 IO Worker #0 INFO import Imported #2423464 4480…e5c6 (0 txs, 0.00 Mgas, 0.49 ms, 0.57 KiB)
2017-11-05 10:16:12 IO Worker #1 INFO import Imported #2423465 580a…de0d (0 txs, 0.00 Mgas, 0.47 ms, 0.57 KiB)
2017-11-05 10:16:15 IO Worker #3 INFO import Imported #2423466 007a…710b (0 txs, 0.00 Mgas, 0.46 ms, 0.57 KiB)
2017-11-05 10:16:20 IO Worker #2 INFO import Imported #2423467 0cc5…45eb (0 txs, 0.00 Mgas, 0.42 ms, 0.57 KiB)
2017-11-05 10:16:24 IO Worker #3 INFO import Imported #2423468 2be9…fe80 (0 txs, 0.00 Mgas, 0.46 ms, 0.57 KiB)
2017-11-05 10:16:26 TRACE rpc Request: {"jsonrpc":"2.0","method":"eth_estimateGas","params":[{"from":"0xd70db13e77dd64e7b3d53f2442912c26a68eaa7f","to":"0x757bc081b80d8287a6bedc8167c4a131a42f5179","data":"0x7e454bb000000000000000000000000049db4bd97d0a29e5ef3cda9c5584c907a35d357a"}],"id":1}.
2017-11-05 10:16:26 DEBUG rpc Response: Some("{\"jsonrpc\":\"2.0\",\"result\":\"0xd700\",\"id\":1}").
2017-11-05 10:16:28 IO Worker #0 INFO import Imported #2423469 e195…41fa (0 txs, 0.00 Mgas, 0.58 ms, 0.57 KiB)
2017-11-05 10:16:32 IO Worker #3 INFO import Imported #2423470 5729…c354 (0 txs, 0.00 Mgas, 0.42 ms, 0.57 KiB)
2017-11-05 10:16:34 IO Worker #0 INFO import 0/25 peers 3 MiB chain 905 MiB db 0 bytes queue 448 bytes sync RPC: 0 conn, 0 req/s, 12921 µs
2017-11-05 10:16:36 IO Worker #2 INFO import Imported #2423471 701b…4bed (0 txs, 0.00 Mgas, 0.42 ms, 0.57 KiB)
2017-11-05 10:16:40 IO Worker #3 INFO import Imported #2423472 b2a8…5872 (0 txs, 0.00 Mgas, 0.35 ms, 0.57 KiB)
2017-11-05 10:16:44 IO Worker #0 INFO import Imported #2423473 ef55…dacc (0 txs, 0.00 Mgas, 0.50 ms, 0.57 KiB

travelnotes commented 6 years ago

Is there any way to configure the block generation strategy? In previous versions we could use a strategy where a block is created only when a transaction is submitted, but now I cannot do this. From 1.7 on, it seems we can only create blocks at a specified interval, regardless of whether there are any transactions, so we end up creating a lot of blank blocks.

5chdn commented 6 years ago

I don't see anything unusual in the logs. Estimate gas immediately returned the result.

w5pand commented 6 years ago

@5chdn "eth_estimateGas" takes more than 60 seconds. [image]

w5pand commented 6 years ago

Then I created a new transaction that sent ether to an address. After that, the eth_estimateGas RPC responded immediately and CPU usage was fine. [image]

bhok commented 6 years ago

It would be nice to have a feature to pause block generation when no transactions exist. Is there any plan to implement this?

heyalistair commented 6 years ago

How much gas do your transactions require? I.e. What's the min gas limit per transaction?

I don't know exactly when you started your chain. But you say that it worked on Oct 11th, and then you noticed it stopped working on Nov 4th. It may be nothing, but according to how you have set up your chain, I think it's worth pointing out that your block gas limit starts at five billion, and then slowly decreases until it hits five thousand. That can take days.

Depending on how your transactions work, you could start a new chain from scratch that has a block size of five thousand set right in the genesis block. And then you can test and see if you are getting the behavior you see on your currently defunct chain. That would confirm that you have a problem with your chain-wide block gas limit.
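As a rough check on how long such a decline would take, the numbers from the posted chain spec can be plugged into a short Python sketch. It assumes the usual Ethereum adjustment rule (a block's gas limit may differ from its parent's by strictly less than parent / gasLimitBoundDivisor) and, as a worst case, that every block lowers the limit by the maximum allowed step. Whether the sealing node actually lowers the limit that far depends on its settings (the posted node config sets gas_floor_target to 20000000), so treat the result as an estimate under those assumptions. The function name is hypothetical.

```python
def blocks_until_limit(start, target, divisor):
    """Count blocks needed to lower a gas limit from start to target,
    assuming each block lowers it by the largest permitted step."""
    limit, blocks = start, 0
    while limit > target:
        # Largest decrease that stays strictly below parent // divisor.
        step = limit // divisor - 1
        if step <= 0:
            raise ValueError("target is unreachable with this divisor")
        limit -= min(step, limit - target)
        blocks += 1
    return blocks


# Values from the chain spec posted in this thread.
GENESIS_GAS_LIMIT = int("0x12a05f200", 16)  # 5,000,000,000
MIN_GAS_LIMIT = int("0x1388", 16)           # 5,000
BOUND_DIVISOR = int("0x400", 16)            # 1,024
STEP_DURATION_S = 2                         # "stepDuration": "2"

blocks = blocks_until_limit(GENESIS_GAS_LIMIT, MIN_GAS_LIMIT, BOUND_DIVISOR)
hours = blocks * STEP_DURATION_S / 3600
print(f"{blocks} blocks, about {hours:.1f} hours at {STEP_DURATION_S} s per block")
```

Under these assumptions the count comes out on the order of 14 to 15 thousand blocks. Passing 20000000 as the target instead of MIN_GAS_LIMIT shows how long it would take to reach the configured gas floor target.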

bmatthewshea commented 6 years ago

Parity version: Parity/v1.8.3-beta-b49c44a-20171114/x86_64-linux-gnu/rustc1.21.0
Operating system: Linux / Ubuntu 16.04.03 LTS
Install: parity_1.8.3_amd64.deb

v.1.8.3 parity-100percent-cpu-2017-11-29-091951

At first I thought it was the UI - removed that on startup. Still doing it. I realize this is a beta but thought I would write it up here. I am going to try switching to a different version.

Updated/Edit: v1.7.9 Same. What is going on with this? There is no reason a process should ever pull this much cpu. You need to limit cpu use/nice it somehow (new flag?). Even if it only does this at startup (only), it should only throttle cpu maybe +100% for a few secs or mins at a time. It does this constantly. It's making my i7 server sloooow. Also, running Geth as different user - which i never see go above 10% CPU much. Stark contrast.

parity-v179-100percent-cpu-2017-11-29-091951

My startup script ("(local server ip)" is a LAN address: my server, which is also the host running parity):

#!/bin/bash
parity \
--chain classic \
--jsonrpc-apis "eth,net,web3" \
--jsonrpc-interface (local server ip) \
--ui-interface (local server ip) \
--ws-interface (local server ip) \
--ws-origins http://(local server ip) \
--allow-ips=public \
--no-discovery \
--author (coin base addr) \
--stratum \
--stratum-port=9009 \
--stratum-interface=0.0.0.0 \
--log-file /home/user/logs/parity-etc.log

Everything 'working' as advertised. Just the CPU issue is raining on my parade - so, may have to remove parity if no resolution. Can't continue this way.

Final edit?: Calmed down around 2 hours in (using 179 stable). Though, under beta I left running overnight and was still pegging cpu when I awoke.

5chdn commented 6 years ago

@bmatthewshea of course, it has to synchronize the chain.

bmatthewshea commented 6 years ago

@5chdn Well, geth didn't/doesn't peg it like that on same server. Yes, it ran somewhat high during first sync (especially first couple minutes), but not -continuously- over 100% the entire time. And as I said, BETA continued 100%+ load (continuously) long after sync. I finally had to kill it. Just letting thread know. Do what you will with it..

tomusdrw commented 6 years ago

Related: #7075. If you estimate gas for a heavy call that always fails, it may essentially lock the RPC thread entirely. Please consider running with --jsonrpc-threads 4 to be able to support 4 such calls simultaneously. We should also include #7075 in 1.9 and backport it.
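The effect of the thread count can be illustrated with a generic thread pool. This Python sketch is not Parity's RPC server. It only models the situation described above: a slow job (standing in for a heavy eth_estimateGas) is queued first, and the code measures how long a trivial job (standing in for a cheap call) has to wait with 1 worker versus 4. The function name and the 0.3-second duration are made up for the illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fast_call_latency(workers, slow_seconds=0.3):
    """Return how long a trivial job waits when a slow job was queued first."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Stands in for a heavy call that occupies one worker thread.
        pool.submit(time.sleep, slow_seconds)
        start = time.monotonic()
        # Stands in for a cheap call arriving right after the heavy one.
        pool.submit(lambda: None).result()
        return time.monotonic() - start


for workers in (1, 4):
    print(f"{workers} worker(s): cheap call waited {fast_call_latency(workers):.3f} s")
```

With a single worker, the cheap job waits for roughly the full duration of the slow one. With 4 workers it returns almost immediately, as long as fewer than 4 slow jobs are in flight at once.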