skalenetwork / skale-consensus

Running at the very core of the SKL network, SKALE BFT consensus is a universal, modern, modular, high-performance, asynchronous, provably secure, agent-based Proof-of-Stake blockchain consensus engine written in C++17. It includes a provably secure embedded Oracle and is used by SKALE elastic blockchains. It is easy and flexible enough to serve as the basis for your own blockchain or smart contract platform. BLS signatures and Binary Asynchronous Consensus are the main building blocks.
https://docs.skale.network/technology/consensus-spec
GNU Affero General Public License v3.0

Skaled crashes after 3 hours of being stuck #812

Open oleksandrSydorenkoJ opened 11 months ago

oleksandrSydorenkoJ commented 11 months ago

Version: skalenetwork/schain:3.17.0-beta.8

Preconditions: An active schain of medium type (can also be reproduced on a 4-node chain)

Steps to reproduce

  1. Stop 6 skaled containers (more than 1/3 of the total nodes)
  2. Wait for 3 hours
  3. Restart one of the stopped skaled containers and check the skaled logs
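
The node counts in step 1 can be checked with a little arithmetic. The sketch below assumes the standard BFT fault bound (a chain of n nodes tolerates at most f faulty nodes, where 3f < n) and assumes that a medium schain runs on 16 nodes; neither figure is stated in this report. The function names are hypothetical and are not part of skale-consensus.

```cpp
#include <cassert>
#include <cstddef>

// Largest number of stopped or faulty nodes an n-node BFT chain can
// tolerate, assuming the standard bound 3f < n.
constexpr std::size_t maxFaulty(std::size_t totalNodes) {
    return totalNodes == 0 ? 0 : (totalNodes - 1) / 3;
}

// True when stopping `stopped` nodes exceeds the tolerated number,
// so the remaining nodes can no longer reach agreement.
constexpr bool consensusStalls(std::size_t totalNodes, std::size_t stopped) {
    return stopped > maxFaulty(totalNodes);
}

// Smallest number of nodes that must be stopped to stall the chain.
inline std::size_t nodesToStopForStall(std::size_t totalNodes) {
    assert(totalNodes >= 4);  // fewer than 4 nodes cannot tolerate any fault
    return maxFaulty(totalNodes) + 1;
}

// Assumed 16-node chain: 5 stopped nodes are tolerated, 6 are not.
static_assert(maxFaulty(16) == 5, "16 nodes tolerate 5 faults");
static_assert(consensusStalls(16, 6), "stopping 6 of 16 stalls consensus");
// 4-node chain: 1 stopped node is tolerated, 2 are not.
static_assert(maxFaulty(4) == 1, "4 nodes tolerate 1 fault");
static_assert(consensusStalls(4, 2), "stopping 2 of 4 stalls consensus");
```

Under these assumptions, stopping 6 containers is the minimum needed to stall a 16-node chain, and stopping 2 containers should be enough to reproduce the stall on a 4-node chain.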

Expected state: After 3 hours of stuck consensus, skaled should gracefully stop its processes, signal a restart from catch-up over the skaled-admin interface, and exit with code 0. Example: Internal exit initiated. Signal -1.

Actual state: Skaled terminates itself with a SIGABRT (6) signal and crashes, writing a stack trace to the logs.

Logs: consensus_3_17_0_beta_8_crashes_after_3_hours_suck.log
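
The difference between the expected and actual behavior can be illustrated with a minimal sketch. This is not skaled's implementation: `StuckWatchdog`, `shutdownIfStuck`, and the `exitRequested` flag are hypothetical names, and the 3-hour limit is taken from this report. The point is only that a timeout handler can request an orderly shutdown and hand back exit code 0, rather than calling `std::abort()`, which raises SIGABRT.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <optional>

// Hypothetical watchdog that tracks when consensus last made progress.
class StuckWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    StuckWatchdog(std::chrono::seconds limit, Clock::time_point start)
        : limit_(limit), lastProgress_(start) {
        assert(limit.count() > 0);  // a non-positive limit would fire immediately
    }

    // Call whenever a new block is committed.
    void onProgress(Clock::time_point now) { lastProgress_ = now; }

    // True once no progress has been recorded for at least `limit`.
    bool isStuck(Clock::time_point now) const {
        return now - lastProgress_ >= limit_;
    }

private:
    std::chrono::seconds limit_;
    Clock::time_point lastProgress_;
};

// Returns the exit code to use if the process should stop, or nullopt if it
// should keep running. On timeout it sets a flag that worker threads are
// assumed to poll so they can wind down cleanly; it never calls std::abort().
inline std::optional<int> shutdownIfStuck(const StuckWatchdog& watchdog,
                                          StuckWatchdog::Clock::time_point now,
                                          std::atomic<bool>& exitRequested) {
    if (!watchdog.isStuck(now)) {
        return std::nullopt;
    }
    exitRequested.store(true);
    return 0;  // graceful exit code, as described under "Expected state"
}
```

In a sketch like this, the main loop would call `shutdownIfStuck` periodically and, once it yields a value, join its worker threads and return that value from `main`.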

oleksandrSydorenkoJ commented 11 months ago

Related to https://github.com/skalenetwork/skaled/issues/1425