Closed shufps closed 1 year ago
In the logs we see that the actual syncing works as expected in both Hornet and TenderCoo. Only a few milestones later, we run into the issue that another validator has proposed a milestone parent that is not solid for us. The parent was issued while we were offline, so it is only requested when the milestone is created/received.
In the Hornet log we see that this request took about 4s, significantly longer than the configured request timeout of 2s. Even in local tests with only one missing parent, I always saw request times longer than 2s (about 3s - 4s). If this is not a bug but the expected time it takes, I strongly recommend increasing the whiteFlagParentsSolidTimeout default to 5s or 10s in TenderCoo as well as Hornet.
retried:
inx-tendercoo | 2023-02-22T13:56:44Z INFO INX Connecting to node and reading node configuration ...
inx-tendercoo | 2023-02-22T13:56:44Z INFO INX > retrying INX connection to node ...
inx-tendercoo | 2023-02-22T13:56:45Z INFO INX > retrying INX connection to node ...
inx-tendercoo | 2023-02-22T13:56:46Z INFO INX > retrying INX connection to node ...
inx-tendercoo | 2023-02-22T13:56:47Z INFO INX Reading node status ...
inx-tendercoo | 2023-02-22T13:56:47Z INFO Coordinator Providing Coordinator ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Providing Coordinator ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading core components ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading core components: INX ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading core components: Coordinator ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading core components: Shutdown ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading plugins ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Loading plugin: Profiling ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Executing core components ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Starting core component: INX ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Starting core component: Coordinator ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Starting core component: Shutdown ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Executing plugins ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Starting plugin: Profiling ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO App Starting background workers ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Starting Decentralized Coordinator ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO INX Starting NodeBridge ...
inx-tendercoo | 2023-02-22T13:56:48Z INFO Profiling You can now access the profiling server using: http://0.0.0.0:6060/debug/pprof/
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Starting TangleListener ... done
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Found private validator {"keyFile": "tendermint/config/priv_validator_key.json", "stateFile": "tendermint/data/priv_validator_state.json"}
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Found node key {"path": "tendermint/config/node_key.json"}
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Found genesis file {"path": "tendermint/config/genesis.json"}
inx-tendercoo | 2023-02-22T13:56:48Z INFO Coordinator Node appears to be connected
inx-tendercoo | 2023-02-22T13:56:48Z WARN Coordinator node is not synced; retrying in 2s
inx-tendercoo | 2023-02-22T13:56:50Z INFO Coordinator Node appears to be synced; latest=8221062 confirmed=8221062
inx-tendercoo | 2023-02-22T13:56:50Z INFO Coordinator Coordinator resumed {"state": {"MilestoneHeight":3943404,"MilestoneIndex":8221045,"LastMilestoneID":"7414cf9d454b1a6085d6f17dfe982df95e796b364d7b68abf6acfbb4e3779a04","LastMilestoneBlockID":"8e95bb9d3bbad64b8b7923d10e545ae2236032214b121417ab689d8f5f8b2857"}}
worked perfectly :ok_hand:
retested with 10min downtime. Resuming worked perfectly.
Closing this issue.
As a test, I stopped hornet for a couple of minutes (about 3) and then restarted both again.
It seems the detection of a synced hornet doesn't work.
After tendercoo synced the tendermint blockchain, it crashed with:
Restarting tendercoo on a synced hornet works without issues.
These are the logs: coo.log tendercoo.log