CentaurusInfra / fornax

Fornax for autonomous and flexible edge computing
Apache License 2.0

mission state pruner in cloud & force mission state report after long idleness in clusterd #8

Closed chenqianfzh closed 3 years ago

chenqianfzh commented 3 years ago

This PR addresses mission status reporting when an edge cluster goes offline, using a lightweight solution.

With this PR:

  1. When an edge cluster goes offline, its mission state in the mission objects is set to "cluster offline", and the state of the edgeclusters underlying the offline cluster is removed. We call this process Mission State Pruning.
  2. When the edge cluster comes back online, the mission state of these edge clusters is updated.
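The pruning step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper `pruneMissionState`, the `state` map keyed by cluster name, the `children` map, and the "running" state value are all assumptions made for the example. It also assumes the cluster hierarchy is a tree with no cycles.

```go
package main

import "fmt"

// ClusterOffline is the placeholder state named in the PR description.
const ClusterOffline = "cluster offline"

// pruneMissionState is a hypothetical helper. state maps an edge cluster name
// to its reported mission state; children maps a cluster to the edge clusters
// directly beneath it.
func pruneMissionState(state map[string]string, children map[string][]string, offline string) {
	// Mark the offline cluster itself.
	state[offline] = ClusterOffline

	// Remove the entries of every cluster beneath it, at any depth.
	stack := append([]string(nil), children[offline]...)
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		delete(state, c)
		stack = append(stack, children[c]...)
	}
}

func main() {
	// "running" is an illustrative state value, not taken from the PR.
	state := map[string]string{"layer-I": "running", "layer-II": "running"}
	children := map[string][]string{"layer-I": {"layer-II"}}

	pruneMissionState(state, children, "layer-I")
	fmt.Println(state) // prints: map[layer-I:cluster offline]
}
```

Clusters outside the offline subtree are left untouched, so their last reported state stays visible in the mission object.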

To achieve this, the PR makes the following code changes:

  1. A new module, MissionStatePruner, is added to cloudcore. It periodically checks the last heartbeat time of each edgecluster. If an edgecluster has sent no heartbeat for a long period (1 minute by default), it is deemed offline and the module prunes the mission state for that edge cluster.
  2. The behavior of mission status reporting in the clusterd module is changed. Previously, the mission state was reported only when it changed. Now the state is also reported if it has not been reported for a long period (set to one minute), even when there is no change.

Verification

  1. Start a KubeEdge cluster with two cascading edgeclusters (layer-I and layer-II, in the direction from cloud to edge). A mission is deployed to the edge clusters. The mission state is as follows:

image

  2. Turn off the edgecore process in layer-I and wait for 1 minute. The mission state is as follows:

image

  3. Restart the edgecore process in layer-I and wait for 1 minute. The mission state is as follows:

image