Open kenricnelson opened 4 years ago
PS: I submitted this issue to input-output-hk/jormungandr as well, so hopefully we will get some feedback from the Cardano community.
Workaround: After closing the terminal window that started jormungandr, or even quitting the terminal application entirely, jormungandr continues to run and stays in sync. Closing the terminal severs any dependency on the ssh terminal connection, so a later disconnect should not cause the node to fall out of sync.
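A possible way to get a similar effect without closing the window by hand is to start the node with none of its standard streams attached to the terminal. The sketch below is untested against jormungandr, and whether it avoids the stall is an assumption. `run_detached` is a hypothetical helper name, not part of jormungandr or jcli.

```shell
#!/bin/sh
# Hypothetical helper: start a command in the background with stdin, stdout
# and stderr all redirected away from the launching terminal.
# Usage: run_detached LOG_FILE COMMAND [ARGS...]
run_detached() {
  log_file="$1"
  shift
  # nohup makes the command ignore SIGHUP; reading from /dev/null and
  # writing to a file leave no open file descriptor on the ssh terminal.
  nohup "$@" < /dev/null > "$log_file" 2>&1 &
  # Print the PID of the background process so the caller can record it.
  echo "$!"
}
```

With the launch command from the reproduction steps, this could be called as `run_detached /data/logs/jorm_log_date_UTC.txt ./jormungandr --config itn_rewards_v1-config.yaml ...`, passing the remaining arguments unchanged.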
Describe the bug
Jormungandr node stops incrementing the blockheight when the SSH terminal to AWS is restarted. The node continues to run, but is out of sync with the generation of new blocks.
Mandatory Information
jcli --v0.8.3
jormungandr --v0.8.3
Macbook Pro laptop running Bash terminal with ssh log in to AWS
AWS Instance: t2.micro
To Reproduce
Steps to reproduce the behavior:
Start jormungandr with the following command designed to run in the background with logging sent to a file:
nohup ./jormungandr --config itn_rewards_v1-config.yaml --genesis-block-hash 8e4d2a343f3dcf9330ad9035b3e8d168e6728904262f2c434a4f8f934ec7b676 --secret ./node_secret.yaml > /data/logs/jorm_log_date_UTC.txt 2>&1 &
Allow it to run for several hours with the Mac in sleep mode (sleep does not shut off background computation). Upon waking the Mac, the ssh terminal may be frozen, in which case ...
output:
After 9 hours of running, I tried to examine the node's status; however, the ssh terminal was frozen, and the session then dropped back to the local bash terminal with the message: packet_write_wait: Connection to 3.20.11.61 port 22: Broken pipe.
Upon ssh log in to AWS instance the message included:
System restart required
Last login: Thu Jan 2 02:12:54 2020 from 146.115.188.76
jcli status report shows:
command: ./jcli rest v0 node stats get --host "http://127.0.0.1:8443/api"
output:
blockRecvCnt: 1249
lastBlockContentSize: 315
lastBlockDate: "19.15244"
lastBlockFees: 800000
lastBlockHash: e1ad6e38eb3a8104e3eb0fe5a1b77733debb6a605fb0378524d728f2bdf31c6f
lastBlockHeight: "63198"
lastBlockSum: 112886597000
lastBlockTime: "2020-01-02T11:40:41+00:00"
lastBlockTx: 1
state: Running
txRecvCnt: 978
uptime: 33581
version: jormungandr 0.8.3-8f276c0
The lastBlockTime of 11:40:41 was just ~2 minutes prior to the restart of the ssh terminal. And lastBlockHeight 63198 was close to the currently reported blockheight on pooltool.io.
Conclusion: Node was syncing to new blocks for 9 hours. When the SSH terminal was restarted, the jormungandr node was still running but the restart must have caused the synchronization to new blocks to stop.
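To confirm a stall like this without reading the stats by eye, two samples of the stats output can be compared. The following is a minimal sketch, assuming the stats are printed one "key: value" pair per line as in the report above. `stats_field` and `height_stalled` are hypothetical helper names, not jcli features.

```shell
#!/bin/sh
# Print the value of one field from stats text read on stdin.
# Surrounding double quotes, as in lastBlockHeight: "63198", are removed.
stats_field() {
  awk -v key="$1" -F': ' '$1 == key { gsub(/"/, "", $2); print $2; exit }'
}

# Exit status 0 when lastBlockHeight did not advance between two samples.
# $1 = earlier stats text, $2 = later stats text.
height_stalled() {
  prev=$(printf '%s\n' "$1" | stats_field lastBlockHeight)
  cur=$(printf '%s\n' "$2" | stats_field lastBlockHeight)
  [ -n "$prev" ] && [ -n "$cur" ] && [ "$cur" -le "$prev" ]
}

# Example use (the 600 second interval is an arbitrary assumption):
#   first=$(./jcli rest v0 node stats get --host "http://127.0.0.1:8443/api")
#   sleep 600
#   second=$(./jcli rest v0 node stats get --host "http://127.0.0.1:8443/api")
#   height_stalled "$first" "$second" && echo "block height has not advanced"
```

Comparing heights rather than checking `state: Running` matters here, because the node in this report still reported Running while no longer syncing.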
Expected behavior
I expect jormungandr to be able to continuously sync to new blocks while running in background mode initiated from ssh terminal.