cnrancher / autok3s

Run K3s Everywhere
https://www.suse.com
Apache License 2.0

[Enhance] support partial rollback when creating cluster or joining nodes #650

Closed JacieChao closed 6 months ago

JacieChao commented 8 months ago

Is your feature request related to a problem? Please describe.

When using a cloud provider to create a cluster or join nodes to a cluster, the failure of any node triggers rollback logic to ensure that no invalid instances are left running in the cloud environment. This approach is inconvenient for the native provider. In native cases, the user needs to prepare hosts before creating a cluster or joining nodes. When joining nodes in batches, a single node failure (e.g. a worker-ip of the wrong type) causes all registered nodes to roll back, and all of them have to rejoin after the problem is fixed.

Describe the solution you'd like

To simplify this kind of case, AutoK3s plans to support partial rollback, which cuts down on repeated operations when joining nodes in batches.

Additional context

Support partial rollback for the native provider first.

Jason-ZW commented 8 months ago

In addition to the partial rollback function, the log output should be enhanced so that each line carries the identifiers of the cluster (cluster-name or cluster-id) and the node (node-name or node-ip), for example: [cluster-id] [node-ip] xxxxx.

JacieChao commented 6 months ago

Tested with v0.9.2-rc3. Closing.