gravitational / teleport

The easiest, and most secure way to access and protect all of your infrastructure.
https://goteleport.com
GNU Affero General Public License v3.0

Node installation problem on k8s node on which teleport is currently running as a pod. #17828

TheAnachronism closed 1 year ago

TheAnachronism commented 1 year ago

To describe the setup: I have a bare-metal k8s cluster whose nodes are VMs spread across multiple physical servers. I'd like to add these VMs to the Teleport cluster, but the Teleport cluster pod inside k8s is currently running on one of the worker nodes. The Teleport installation script seems to check whether anything Teleport-related is already running, and it treats the cluster pod's process as an existing installation that it would overwrite. I can't just kill the process it finds, because that is the cluster I want to add the node to. Forcing the Teleport pod onto a different node would work, but that feels like a workaround rather than a real fix.

Expected behavior: The installation script provided by the cluster installs Teleport on the node, even when the cluster process is running in a pod on that same node.

Current behavior: The installation script detects the cluster process running in the pod and refuses to continue.
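To illustrate the distinction I'd expect the installer to make, here is a minimal sketch of a check that ignores teleport processes running inside Kubernetes pods. It is not taken from the installer: the function names are hypothetical, I'm assuming the installer finds the process by name (as with pgrep), and I'm assuming that pod processes show a cgroup path containing "kubepods" when read from the host, which is a heuristic that may not hold for every container runtime or cgroup driver.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given cgroup file (normally
# /proc/<pid>/cgroup) looks like it belongs to a Kubernetes pod.
# Assumption: the kubelet places pod cgroups under a path containing "kubepods".
cgroup_file_is_pod() {
    grep -q 'kubepods' "$1" 2>/dev/null
}

# Hypothetical helper: print the PIDs of processes named exactly "teleport"
# that do NOT appear to run inside a pod, i.e. likely host installations.
host_teleport_pids() {
    for pid in $(pgrep -x teleport 2>/dev/null); do
        if ! cgroup_file_is_pod "/proc/${pid}/cgroup"; then
            echo "${pid}"
        fi
    done
}
```

With a check like this, an empty result from `host_teleport_pids` would mean that only pod-hosted teleport processes were found, so the installation could proceed.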

Bug details:

2022-10-26 10:50:18 CEST [teleport-installer] TELEPORT_VERSION: 11.0.0
2022-10-26 10:50:18 CEST [teleport-installer] TARGET_HOSTNAME: XXX.XXX.XXX
2022-10-26 10:50:18 CEST [teleport-installer] TARGET_PORT: 443
2022-10-26 10:50:18 CEST [teleport-installer] JOIN_TOKEN: <MY_TOKEN>
2022-10-26 10:50:18 CEST [teleport-installer] CA_PIN_HASHES: sha256:30012130113303be5d47a0e0e9df49eb87690f237e8a2ffe7c2b221c20de0684
2022-10-26 10:50:18 CEST [teleport-installer] Checking TCP connectivity to Teleport server (teleport.anachronis.dev:443)
2022-10-26 10:50:18 CEST [teleport-installer] Connectivity to Teleport server (via nc) looks good
2022-10-26 10:50:18 CEST [teleport-installer] Detected host: linux-gnu, using Teleport binary type linux
2022-10-26 10:50:18 CEST [teleport-installer] Detected arch: x86_64, using Teleport arch amd64
2022-10-26 10:50:18 CEST [teleport-installer] Detected distro type: debian
2022-10-26 10:50:18 CEST [teleport-installer] Using Teleport distribution: deb
2022-10-26 10:50:18 CEST [teleport-installer] Created temp dir /tmp/teleport-4XlXwGaSZh

Warning: Teleport appears to already be running on this host (pid: 2342735)

This script does not overwrite any existing settings or Teleport installations.
Please clean up by running any of the following steps as necessary:
- stop any running Teleport processes
  - pkill -f teleport
- remove any data under /var/lib/teleport, along with the directory itself
  - rm -rf /var/lib/teleport
- remove any configuration at /etc/teleport.yaml
  - rm -f /etc/teleport.yaml
- remove any Teleport binaries (teleport tctl tsh) installed under /usr/local/bin
  - rm -f /usr/local/bin/teleport /usr/local/bin/tctl /usr/local/bin/tsh 
Run this installer again when done.

TheAnachronism commented 1 year ago

The only thing that helped was moving the teleport-cluster pod to a different node (draining the node might work too) and then running the installer again. It is still odd that Teleport cannot recognize that the other process belongs to the k8s pod rather than to a Teleport node instance.
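For anyone hitting the same thing, the workaround could be scripted roughly as follows. This is a sketch under assumptions, not a verified procedure: `run`, `free_node`, and `restore_node` are hypothetical helper names, `worker-1` is a placeholder node name, and I have not confirmed that draining behaves the same as moving only the teleport-cluster pod. By default the helpers only print the kubectl commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Hypothetical helper: print the command (default) or execute it (RUN=1).
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Evict pods from NODE (drain also marks it unschedulable), so that the
# teleport-cluster pod gets rescheduled onto another node.
free_node() {
    run kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Allow pods to be scheduled onto NODE again after the installation.
restore_node() {
    run kubectl uncordon "$1"
}
```

Usage would be `free_node worker-1`, then running the installation script on that VM once the teleport-cluster pod is up again on another node (the installer checks connectivity to the cluster first), and finally `restore_node worker-1`.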