Consensys / quorum

A permissioned implementation of Ethereum supporting data privacy
https://www.goquorum.com/
GNU Lesser General Public License v3.0

Panic with IBFT Regular node #567

Closed. marcosio closed this issue 5 years ago

marcosio commented 5 years ago

System information

Geth version: 1.7.2-stable. Quorum version: v2.0.2

OS & Version: Ubuntu Linux

Branch, Commit Hash or Release: commit df4267a

Related issue:

alastria-node #312

Expected behaviour

The regular node should stay online.

Actual behaviour

After approximately 10 minutes, the node shuts down with a panic message.

Steps to reproduce the behaviour

Start synchronizing the node. After about 10 minutes, the node shuts down with a panic error.

Backtrace

quorum.log quorum_20181109222201.log

jbhurat commented 5 years ago

Hi @marcosio, I am trying to reproduce the above issue, with no luck so far. Can you provide more details on how the network is set up (number of nodes, etc.)? Also, I believe the network has been running for some time and the issue started happening when one of the nodes was restarted. Is my understanding correct? Finally, can you please lay out the steps to reproduce the issue?

jbhurat commented 5 years ago

Hi @marcosio, do you still see this issue? If yes, can you please provide additional details on how to reproduce it?

fixanoid commented 5 years ago

Please reopen when details are available. Thank you.

marcosio commented 5 years ago

We tested Quorum 2.1.0 on the same machine, and the same problem occurs on that node.

The memory profile is attached: 20181203_1847.zip

jbhurat commented 5 years ago

Hi @marcosio, as mentioned in the above comment, can you please provide additional details on how the network is set up (number of nodes, consensus used, etc.)?

Can you also provide the geth arguments used and attach the logs?

marcosio commented 5 years ago

Hi!

Our network currently has about 50 nodes and uses IBFT as its consensus protocol. This particular node is configured as a regular node.

We start the regular nodes with this command:

geth --datadir /home/ubuntu/alastria/data --networkid 82584648528 --identity REG_Company_Arrakis_4_0_0 --permissioned --rpc --rpcaddr 0.0.0.0 --rpcapi admin,db,eth,debug,miner,net,shh,txpool,personal,web3,quorum,istanbul --rpcport 22000 --port 21000 --istanbul.requesttimeout 10000 --ethstats REG_Trylobyte_Arrakis_4_0_0:bb98a0b6442386d0cdf8a31b267892c1@netstats.testnet.alastria.io.builders:80 --verbosity 3 --vmdebug --emitcheckpoints --targetgaslimit 18446744073709551615 --syncmode fast --vmodule consensus/istanbul/core/core.go=5 --nodiscover

jbhurat commented 5 years ago

How much memory is available on the host where this geth instance is running? Can you try increasing the memory and check whether that fixes the issue?
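One way to answer the memory question on a Linux host is to read /proc/meminfo. The following is a minimal sketch, not part of the original thread; the function names parse_meminfo and summarize_memory are hypothetical, and it assumes the kernel exposes the standard MemTotal, MemAvailable, and SwapTotal fields.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric value (kB for size fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize_memory(text):
    """Return (total, available, swap_total) in MiB; a missing field yields None."""
    info = parse_meminfo(text)

    def mib(key):
        value = info.get(key)
        return None if value is None else value // 1024

    return mib("MemTotal"), mib("MemAvailable"), mib("SwapTotal")
```

On the node itself, this could be called as summarize_memory(open("/proc/meminfo").read()) shortly before the panic is expected, to see how much memory and swap remain.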

jbhurat commented 5 years ago

Can you also try removing personal from the --rpcapi list in the geth arguments and see whether that fixes the issue?
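The suggested change amounts to dropping one entry from the comma-separated --rpcapi value while leaving every other argument untouched. A minimal sketch of that edit follows; the helper name drop_rpc_module is hypothetical, and the shortened command in the demo is an illustration, not the full command from this thread.

```python
import shlex


def drop_rpc_module(args, module, flag="--rpcapi"):
    """Return a copy of args with `module` removed from the value that follows `flag`."""
    result = list(args)
    for i, arg in enumerate(result[:-1]):
        if arg == flag:
            kept = [m for m in result[i + 1].split(",") if m != module]
            result[i + 1] = ",".join(kept)
    return result


# Shortened, illustrative command line (assumption: same flag layout as the real one).
command = "geth --rpc --rpcapi admin,eth,personal,web3 --rpcport 22000"
print(" ".join(drop_rpc_module(shlex.split(command), "personal")))
```

Working on the split argument list rather than the raw string avoids accidentally matching the word personal inside another argument.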

marcosio commented 5 years ago

The node has 4 GB of main memory and 5 GB of swap.

We also removed the personal RPC API from the node, but the chain still does not synchronize. quorum_20181210143816.log

jbhurat commented 5 years ago

@marcosio I looked at the logs, and it looks like the block header import was rolled back. Did any of the peers disconnect or stop during this process? Also, do you see the same issue if you retry?

jbhurat commented 5 years ago

@marcosio do you still see this issue?

jbhurat commented 5 years ago

Hi @marcosio I am closing this issue for now. Please reopen if you need any further assistance.