techno-tim / k3s-ansible

The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more. Build. Destroy. Repeat.
https://technotim.live/posts/k3s-etcd-ansible/
Apache License 2.0

Switching from single master cluster to multi master HA cluster #211

Closed ZappaBoy closed 1 year ago

ZappaBoy commented 1 year ago

Expected Behavior

I am trying to initially create a cluster with a single master and three workers, and then add two more masters in order to turn the cluster into HA mode. The reason is that I need a simple cluster at first and want to expand it later. I'm testing all of this in a Vagrant environment.

I do not know if this approach is correct, but it would be very convenient (especially for me) to be able to switch from a single-master cluster to an HA one by simply re-running the playbook.

Current Behavior

So I initially created a cluster with a single master and three workers, and it works perfectly. When I run the process again after adding two masters to the hosts.ini file, I get an error during the check that all nodes have joined: after 20 retries the playbook stops. So I checked the k3s-init.service on the master machines and I constantly see this error:

starting kubernetes: preparing server: failed to validate server configuration: https://192.168.56.101:6443/v1-k3s/config: 401 Unauthorized
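
For reference, this is roughly how I inspected the failing unit on the new masters (a sketch, using the k3s-init.service unit mentioned above):

# e.g. on 192.168.56.102 or 192.168.56.103
sudo systemctl status k3s-init.service
sudo journalctl -u k3s-init.service --no-pager | tail -n 50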

Steps to Reproduce

  1. Create the environment using vagrant up
  2. Run ansible-playbook site.yml -i inventory/my-cluster/hosts.ini
  3. Add two more masters to the hosts.ini
  4. Run the playbook again (sketched below).
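
Concretely, steps 3 and 4 amount to uncommenting the extra masters in the inventory and re-running the playbook (a sketch; paths and group names as in the files below):

# uncomment master2 and master3 under [master:children] in hosts.ini, then:
ansible-playbook site.yml -i inventory/my-cluster/hosts.ini

# join status can then be checked from the first master:
sudo k3s kubectl get nodes -o wide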

Context (variables)

Variables Used

all.yml

---
k3s_version: v1.24.9+k3s1
# this is the user that has ssh access to these machines
ansible_user: vagrant
systemd_dir: /etc/systemd/system

# Set your timezone
system_timezone: "Europe/Rome"

# interface which will be used for flannel
flannel_iface: "eth1"

# apiserver_endpoint is virtual ip-address which will be configured on each master
apiserver_endpoint: "192.168.56.222"

# k3s_token is required so that masters can talk together securely
# this token should be alphanumeric only
k3s_token: "changeme"

# The IP on which the node is reachable in the cluster.
# Here, a sensible default is provided, you can still override
# it for each of your hosts, though.
k3s_node_ip: '{{ ansible_facts[flannel_iface]["ipv4"]["address"] }}'

# Disable the taint manually by setting: k3s_master_taint = false
k3s_master_taint: "{{ true if groups['node'] | default([]) | length >= 1 else false }}"

# these arguments are recommended for servers as well as agents:
extra_args: >-
  --flannel-iface={{ flannel_iface }}
  --node-ip={{ k3s_node_ip }}

# change these to your liking, the only required ones are: --disable servicelb, --tls-san {{ apiserver_endpoint }}
extra_server_args: >-
  {{ extra_args }}
  --node-ip={{ ansible_eth1.ipv4.address }} --flannel-iface={{ flannel_iface }}
  {{ '--node-taint node-role.kubernetes.io/master=true:NoSchedule' if k3s_master_taint else '' }}
  --tls-san {{ apiserver_endpoint }}
  --disable servicelb
  --disable traefik
extra_agent_args: >-
  {{ extra_args }}

# image tag for kube-vip
kube_vip_tag_version: "v0.5.7"

# image tag for metal lb
metal_lb_speaker_tag_version: "v0.13.7"
metal_lb_controller_tag_version: "v0.13.7"

# metallb ip range for load balancer
metal_lb_ip_range: "192.168.56.80-192.168.56.100"

Hosts

hosts.ini

[master1]
192.168.56.101

[master1:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/master1/virtualbox/private_key

[master2]
192.168.56.102

[master2:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/master2/virtualbox/private_key

[master3]
192.168.56.103

[master3:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/master3/virtualbox/private_key

[worker1]
192.168.56.201

[worker1:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/worker1/virtualbox/private_key

[worker2]
192.168.56.202

[worker2:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/worker2/virtualbox/private_key

[worker3]
192.168.56.203

[worker3:vars]
ansible_ssh_private_key_file=/path/to/vagrant/.vagrant/machines/worker3/virtualbox/private_key

[master:children]
master1
# Uncomment the following lines after the single-master playbook run
# master2
# master3

[node:children]
worker1
worker2
worker3

[k3s_cluster:children]
master
node

[k3s_cluster:vars]
ansible_ssh_user=vagrant

Vagrantfile

Vagrantfile

# -*- mode: ruby -*-
# vi: set ft=ruby :

base_image_name = "alvistack/ubuntu-22.04"
master_image_name = base_image_name
worker_image_name = base_image_name
master_base_name = "master"
worker_base_name = "worker"
ram = 1024
cpu = 2

masters = [
  {
    :hostname => master_base_name + "1",
    :ip => "192.168.56.101",
    :box => master_image_name,
    :ram => ram,
    :cpu => cpu
  },
  {
    :hostname => master_base_name + "2",
    :ip => "192.168.56.102",
    :box => master_image_name,
    :ram => ram,
    :cpu => cpu
  },
  {
    :hostname => master_base_name + "3",
    :ip => "192.168.56.103",
    :box => master_image_name,
    :ram => ram,
    :cpu => cpu
  }
]

workers = [
  {
    :hostname => worker_base_name + "1",
    :ip => "192.168.56.201",
    :box => worker_image_name,
    :ram => ram,
    :cpu => cpu
  },
  {
    :hostname => worker_base_name + "2",
    :ip => "192.168.56.202",
    :box => worker_image_name,
    :ram => ram,
    :cpu => cpu
  },
  {
    :hostname => worker_base_name + "3",
    :ip => "192.168.56.203",
    :box => worker_image_name,
    :ram => ram,
    :cpu => cpu
  }
]

servers = [
  *masters,
  *workers
]

Vagrant.configure(2) do |config|
  servers.each do |machine|
    config.vm.define machine[:hostname] do |node|
      node.vm.box = machine[:box]
      node.vm.hostname = machine[:hostname]
      node.vm.network "private_network", ip: machine[:ip]
      node.vm.provider "virtualbox" do |vb|
        vb.customize ["modifyvm", :id, "--memory", machine[:ram]]
        vb.memory = machine[:ram]
        vb.cpus = machine[:cpu]
      end
    end
  end
end

Possible Solution

I found in the k3s docs that for an existing cluster installation it is necessary to restart the k3s server with the --cluster-init flag. I tried this, but it does not seem to work for me; I am probably doing something wrong.
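
For what it's worth, this is roughly what I tried on the first master, based on the k3s docs (a sketch, assuming a standard systemd-managed k3s server; the exact unit file layout may differ):

# on master1: stop the running server
sudo systemctl stop k3s

# add --cluster-init to the `k3s server ...` arguments in the unit file
sudoedit /etc/systemd/system/k3s.service

sudo systemctl daemon-reload
sudo systemctl start k3s

# k3s should then migrate the datastore from SQLite to embedded etcd;
# the logs can be checked with:
sudo journalctl -u k3s | grep -i etcd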