This PR addresses the issue of mission status reporting when an edge cluster goes offline, using a lightweight solution.
With this PR:
When an edge cluster goes offline, its mission state in the mission objects is set to "cluster offline", and the states of the edgeclusters underneath the offline cluster are removed. We call this process Mission State Pruning.
When the edge cluster comes back online, the mission states of these edge clusters are updated again.
To achieve this, this PR makes the following code changes:
A new module, MissionStatePruner, is added to cloudcore. It periodically checks the last heartbeat time of the edgeclusters. If an edgecluster has sent no heartbeat for a long period (1 minute by default), it is deemed offline and the module performs mission state pruning for that edge cluster.
Changed the behavior of mission status reporting in the clusterd module. Previously, the mission state was reported only when it changed. With the new behavior, the state is also reported if it has not been reported for a long period (set to one minute), even if nothing has changed.
Verification
Start a KubeEdge cluster with two cascading edgeclusters (layer-I and layer-II, in the direction from cloud to edge) and deploy a mission to the edge clusters. The mission state is as follows:
Turn off the edgecore process in layer-I and wait for 1 minute. The mission state is as follows:
Restart the edgecore process in layer-I and wait for 1 minute. The mission state is as follows: