Closed pfaelzerchen closed 7 months ago
To add one more thing: I tried to simulate a failed node, so I destroyed the partition table on tick, reinstalled a fresh OS (Ubuntu 22.04 LTS) and re-ran the existing k3s playbook. It starts reinstalling things on tick, but I got essentially the same result: tick is up and running a new cluster, while trick and track are broken.
I did some more experiments. It seems that the role really relies on the first node in the inventory being present:
tick is successfully out of the cluster.
tick is again present in the cluster. Everything works fine.
The rollout makes some changes, but the cluster stays fully functional, as expected.
So this probably isn't a bug in the code, but something for the documentation. In an HA deployment with three control nodes, this behaviour was definitely unexpected.
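If the documentation route is taken, it might also be worth mentioning that the primary control node can be pinned explicitly. This is only a sketch based on my reading of the role's variable docs; the file path is hypothetical and I'm assuming `k3s_control_node` / `k3s_primary_control_node` are the variables that control this:

```yaml
# host_vars/tick.yml (hypothetical path; assumes the role's
# k3s_control_node / k3s_primary_control_node host variables)
k3s_control_node: true
k3s_primary_control_node: true
```

That way it is at least explicit which node the role will bootstrap (and re-bootstrap) the cluster from.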
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Not stale
Summary
I've set up a three-node HA cluster with etcd following the quickstart guide, with the hosts tick, trick and track. Then I wanted to test how to take single nodes out of the cluster (e.g. to install new Ubuntu LTS releases) and bring them back in. This works fine with trick and track, but not with tick.
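Roughly, the kind of take-out/rejoin procedure I mean looks like the sketch below. The node name trick is from my setup; inventory.yml and site.yml are placeholder names, and with the default DRY_RUN=1 the commands are only printed, not executed:

```shell
# Sketch: take one node out of the cluster and rejoin it later.
# DRY_RUN=1 (the default here) only echoes the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# 1. Drain the node and remove it (run from a working control node):
run kubectl drain trick --ignore-daemonsets --delete-emptydir-data
run kubectl delete node trick

# 2. On trick itself, wipe the old k3s state before reinstalling the OS:
run /usr/local/bin/k3s-uninstall.sh

# 3. Re-run the playbook to bring the node back in (placeholder file names):
run ansible-playbook -i inventory.yml site.yml
```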
I'm relatively new to ansible and k3s, so sorry if I didn't see something obvious.
Issue Type
Controller Environment and Configuration
I'm using v3.4.2 from ansible-galaxy. The following dump has been shortened.
Steps to Reproduce
Playbook:
Inventory:
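(The actual inventory is not reproduced here. For illustration only, the layout of a three-node setup like mine looks roughly like the following; the group name k3s_cluster is an assumption, not my real file:)

```yaml
# inventory.yml - illustrative only, not the original inventory
k3s_cluster:
  hosts:
    tick:    # first host in the group; the role appears to bootstrap from here
    trick:
    track:
```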
Expected Result
The cluster is up and running with all three nodes and uses the existing certificates.
Actual Result
tick is up and running as a single-node cluster; trick and track are unable to start k3s. The systemd unit fails on those hosts.
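To see why the unit fails, the standard systemd diagnostics on trick/track are the first stop (the `|| true` guards only make the snippet safe to paste on machines without k3s installed):

```shell
# Inspect the failing k3s unit and its recent log output.
# Guarded so the snippet does not abort on machines without k3s/systemd.
systemctl status k3s.service --no-pager || true
journalctl -u k3s.service --no-pager -n 100 || true
```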
I also copied the kubectl configuration to my local machine. Locally, I can no longer connect with kubectl because the certificates are wrong. So it seems that tick got a completely new installation with new certificates. After steps 1 and 2 the cluster was still reachable with the existing certificates.
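To confirm that the serving certificate on tick really changed, one can inspect what is presented on the default API port 6443 (a sketch; the final echo just prints a note instead of failing when tick is not reachable from the local machine):

```shell
# Print subject and validity dates of the certificate served by tick:6443.
echo | openssl s_client -connect tick:6443 2>/dev/null \
  | openssl x509 -noout -subject -dates 2>/dev/null \
  || echo "tick:6443 not reachable from here"
```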