PyratLabs / ansible-role-k3s

Ansible role for installing k3s as either a standalone server or HA cluster.
BSD 3-Clause "New" or "Revised" License

bad certificate - Waiting for control-plane node #158

tbrtje closed this issue 2 years ago

tbrtje commented 2 years ago

Summary

When creating a single-node cluster, the cluster won't start. k3s is up and running, but kubectl shows no nodes. The k3s service logs the following lines over and over:

level=info msg="Waiting for control-plane node kube-master startup: nodes \"kube-master\" not found"
level=info msg="Cluster-Http-Server http: TLS handshake error from 127.0.0.1:54348: remote error: tls: bad certificate"

Issue Type

Controller Environment and Configuration

attached to this issue

v2.11

Steps to Reproduce

Run this role with the following vars, where ansible_host is the IP of the Ubuntu node.

k3s_server:
  advertise-address: "{{ ansible_host }}"
  disable:
  - traefik
k3s_agent:
  node-ip: "{{ ansible_host }}"
  node-external-ip: "{{ ansible_host }}"
k3s_release_version: stable
k3s_become_for_all: true

Expected Result

I expected a working single-node cluster running on said host.

Actual Result

The API server starts up, but no node is registered, so no workloads are scheduled.

level=info msg="Waiting for control-plane node kube-master startup: nodes \"kube-master\" not found"
level=info msg="Cluster-Http-Server http: TLS handshake error from 127.0.0.1:54348: remote error: tls: bad certificate"

xanmanning commented 2 years ago

Have you tried setting tls-san?

k3s_server:
  advertise-address: "{{ ansible_host }}"
  tls-san: "{{ ansible_host }}"
  disable:
    - traefik
k3s_agent:
  node-ip: "{{ ansible_host }}"
  node-external-ip: "{{ ansible_host }}"
k3s_release_version: stable
k3s_become_for_all: true

https://rancher.com/docs/k3s/latest/en/installation/install-options/server-config/#listeners

tbrtje commented 2 years ago

Thank you for your help :) It turns out the issue had nothing to do with the certificate. The node just didn't start because it was running in an LXC container that wasn't able to use overlayfs. Changing the snapshotter to native did the trick.
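
For anyone hitting the same symptoms, here is a minimal sketch of what that change might look like in the role vars. The exact vars that fixed it were not posted in this thread, so this is an assumption: it reuses the vars from the original report and adds k3s's `snapshotter` option, assuming the role passes that key through to the k3s config the same way it does the other keys. Placing it under k3s_agent rather than k3s_server is also a guess.

```yaml
# Assumed vars, not the poster's actual config.
k3s_server:
  advertise-address: "{{ ansible_host }}"
  disable:
    - traefik
k3s_agent:
  node-ip: "{{ ansible_host }}"
  node-external-ip: "{{ ansible_host }}"
  # Use the native snapshotter instead of the default overlayfs one,
  # which the LXC container in this issue could not use.
  snapshotter: native
k3s_release_version: stable
k3s_become_for_all: true
```

The native snapshotter copies layers instead of stacking them, so it tends to use more disk space and be slower than overlayfs, but it does not depend on overlayfs support in the container.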