infinitydon opened 6 months ago
Hi! You don't need to create `K0sControllerConfig` manually; k0smotron will create it automatically based on `K0sControlPlane`. This should solve the issue.
@makhov - Thanks for the response. How will the machine bootstrap be configured if `K0sControllerConfig` should not be used?
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  name: proxmox-cp-5g-pool-0
  namespace: default
spec:
  clusterName: core-5g-cp-cluster
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
      kind: K0sControllerConfig
      name: proxmox-cp-5g-pool-0
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: RemoteMachine
    name: proxmox-cp-5g-pool-0
```
Also, has this kind of scenario been tested before? It would be great to have a sample in the docs.
k0smotron generates the `K0sControllerConfig`, as well as the `Machine` and `RemoteMachine` objects, automatically based on the `K0sControlPlane` object.
Note that `K0sControlPlane.spec.machineTemplate.infrastructureRef` should contain a reference to an actual machine template, not a machine. Since that goes somewhat against the nature of RemoteMachines, we introduced `PooledRemoteMachine`, which can be used with `RemoteMachineTemplate`. Here are the docs and an example of how to use it:
https://docs.k0smotron.io/stable/capi-remote/#using-remotemachines-in-machinetemplates-of-higher-level-objects
Here is the full example that we use in our tests: https://github.com/k0sproject/k0smotron/blob/main/inttest/capi-remote-machine-template/capi_remote_machine_template_test.go#L262
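In short, the `RemoteMachineTemplate` only names a pool, and each `PooledRemoteMachine` registers one concrete host into that pool. A minimal sketch of that linkage (the resource names, address, and Secret name below are illustrative, not taken from this issue):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: RemoteMachineTemplate
metadata:
  name: cp-template           # referenced from K0sControlPlane.spec.machineTemplate.infrastructureRef
  namespace: default
spec:
  template:
    spec:
      pool: controlplane      # hosts are picked from this pool
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: PooledRemoteMachine
metadata:
  name: cp-host-1
  namespace: default
spec:
  pool: controlplane          # must match the template's pool name
  machine:
    address: 10.0.0.10        # SSH address of the actual host
    port: 22
    user: root
    sshKeyRef:
      name: cp-host-ssh-key   # Secret holding the SSH private key
```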
This is what I am using now:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: core-5g-cp-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 172.16.0.0/16
    serviceDomain: cluster.local
    services:
      cidrBlocks:
        - 10.128.0.0/12
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: K0sControlPlane
    name: core-5g-cp-cluster
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: RemoteCluster
    name: core-5g-cp-cluster
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: RemoteCluster
metadata:
  name: core-5g-cp-cluster
  namespace: default
spec:
  controlPlaneEndpoint:
    host: 192.168.100.201
    port: 6443
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0sControlPlane
metadata:
  name: core-5g-cp-cluster
spec:
  replicas: 1
  version: v1.27.1+k0s.0
  k0sConfigSpec:
    k0s:
      apiVersion: k0s.k0sproject.io/v1beta1
      kind: ClusterConfig
      metadata:
        name: k0s
      spec:
        api:
          address: 192.168.100.201
          port: 6443
          extraArgs:
            anonymous-auth: "true"
        network:
          provider: custom
        extensions:
          helm:
            concurrencyLevel: 5
            repositories:
              - name: cilium
                url: https://helm.cilium.io/
            charts:
              - name: cilium
                chartname: cilium/cilium
                namespace: kube-system
                version: "1.15.0"
                values: |
                  operator:
                    replicas: 1
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: RemoteMachineTemplate
      name: proxmox-cp-5g-pool-0
      namespace: default
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: RemoteMachineTemplate
metadata:
  name: proxmox-cp-5g-pool-0
  namespace: default
spec:
  template:
    spec:
      pool: controlplane
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: PooledRemoteMachine
metadata:
  name: proxmox-cp-5g-pool-0
  namespace: default
spec:
  pool: controlplane
  machine:
    address: 192.168.100.201
    port: 22
    user: root
    sshKeyRef:
      name: prox-key
```
But there are no nodes:
kubectl --kubeconfig /tmp/kubeconfig_k0smotron get no -o wide
No resources found
ubuntu@cpamox-clusterapi-mgmt-node:~/vm-proxmox-terraform$ kubectl --kubeconfig /tmp/kubeconfig_k0smotron get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-operator-78bc59578b-z95qg 0/1 Pending 0 69s
kube-system coredns-878bb57ff-6f4qx 0/1 Pending 0 69s
kube-system metrics-server-7f86dff975-hblb7 0/1 Pending 0 69s
projectsveltos sveltos-agent-manager-c5784b88b-tfrv9 0/1 Pending 0 69s
On the remote machine, k0s status:
k0s status
Version: v1.27.1+k0s.0
Process ID: 9875
Role: controller
Workloads: false
SingleNode: false
containerd also does not seem to be working:
k0s ctr c ls
Error: failed to dial "/run/k0s/containerd.sock": context deadline exceeded: connection error: desc = "transport: error while dialing: dial unix:///run/k0s/containerd.sock: timeout"
Usage:
k0s ctr [flags]
Flags:
-h, --help help for ctr
Is there any further configuration needed to make the k8s control-plane node show up?
Great, the control plane is working now. By default, k0s doesn't run any workloads on the controller nodes. To enable that, you need to add the `--enable-worker` arg.
Note that in this case the `node-role.kubernetes.io/master:NoExecute` taint will be added to the worker node automatically. To disable it, use `--no-taints`. More info: https://docs.k0sproject.io/stable/worker-node-config/?h=tain#taints
An example:
```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0sControlPlane
metadata:
  name: test-cp
spec:
  replicas: 1
  k0sConfigSpec:
    args:
      - --enable-worker
      - --no-taints # disable default taints
```
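Applied to the manifests earlier in this thread, the args would sit under `k0sConfigSpec` next to the existing `k0s` block; one way to combine them, sketched from that configuration:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: K0sControlPlane
metadata:
  name: core-5g-cp-cluster
spec:
  replicas: 1
  version: v1.27.1+k0s.0
  k0sConfigSpec:
    args:
      - --enable-worker
      - --no-taints   # optional: skip the default node-role.kubernetes.io/master:NoExecute taint
    k0s:
      # ... unchanged ClusterConfig from the manifest above ...
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: RemoteMachineTemplate
      name: proxmox-cp-5g-pool-0
      namespace: default
```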
Thanks, it works now and I am able to list the nodes.
While it is working, the k0smotron controller logs are still showing errors indicating that provisioning may not be complete:
DEBUG [ssh] 192.168.100.201:22: executing `env -i LC_ALL=C stat -c '%#f %s %.9Y //%n//' -- /k0s 2> /dev/null`
DEBUG [ssh] 192.168.100.201:22: 0x41a4 4096 1716592134.095046379 ///k0s//
DEBUG [ssh] 192.168.100.201:22: executing `env -i LC_ALL=C stat -c '%#f %s %.9Y //%n//' -- /k0s/k0sleave-openrc 2> /dev/null`
DEBUG [ssh] 192.168.100.201:22: 0x81a4 106 1716742261.993995198 ///k0s/k0sleave-openrc//
DEBUG [ssh] 192.168.100.201:22: executing `truncate -s 0 /k0s/k0sleave-openrc`
DEBUG [ssh] 192.168.100.201:22: executing `env -i LC_ALL=C stat -c '%#f %s %.9Y //%n//' -- /k0s/k0sleave-openrc 2> /dev/null`
DEBUG [ssh] 192.168.100.201:22: 0x81a4 0 1716742262.757989595 ///k0s/k0sleave-openrc//
DEBUG [ssh] 192.168.100.201:22: executing `stat -c "%s" /k0s 2> /dev/null || stat -f "%k" /k0s`
DEBUG [ssh] 192.168.100.201:22: 4096
DEBUG [ssh] 192.168.100.201:22: executing `dd if=/dev/stdin of=/k0s/k0sleave-openrc bs=1 count=106 seek=0 conv=notrunc`
2024-05-26T16:51:02Z INFO uploaded file {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "remotemachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "machine": "core-5g-cp-cluster-0", "path": "/k0s/k0sleave-openrc", "permissions": 420}
DEBUG [ssh] 192.168.100.201:22: executing `curl -sSfL https://get.k0s.sh | K0S_VERSION=v1.27.1+k0s.0 sh`
DEBUG [ssh] 192.168.100.201:22: Downloading k0s from URL: https://github.com/k0sproject/k0s/releases/download/v1.27.1+k0s.0/k0s-v1.27.1+k0s.0-amd64
2024-05-26T16:51:02Z ERROR failed to run command {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "remotemachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "machine": "core-5g-cp-cluster-0", "output": "Downloading k0s from URL: https://github.com/k0sproject/k0s/releases/download/v1.27.1+k0s.0/k0s-v1.27.1+k0s.0-amd64", "error": "command failed: client exec: ssh session wait: Process exited with status 2"}
github.com/k0sproject/k0smotron/internal/controller/infrastructure.(*Provisioner).Provision
/workspace/internal/controller/infrastructure/provisioner.go:94
github.com/k0sproject/k0smotron/internal/controller/infrastructure.(*RemoteMachineController).Reconcile
/workspace/internal/controller/infrastructure/remote_machine_controller.go:225
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:227
2024-05-26T16:51:02Z ERROR Failed to provision RemoteMachine {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "remotemachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "machine": "core-5g-cp-cluster-0", "error": "failed to run command: command failed: client exec: ssh session wait: Process exited with status 2"}
github.com/k0sproject/k0smotron/internal/controller/infrastructure.(*RemoteMachineController).Reconcile
/workspace/internal/controller/infrastructure/remote_machine_controller.go:227
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:227
2024-05-26T16:51:02Z INFO Reconcile complete {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "remotemachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "machine": "core-5g-cp-cluster-0"}
2024-05-26T16:51:02Z INFO Updating RemoteMachine status: {Ready:false FailureReason:ProvisionFailed FailureMessage:failed to run command: command failed: client exec: ssh session wait: Process exited with status 2} {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "remotemachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "machine": "core-5g-cp-cluster-0"}
2024-05-26T16:51:02Z ERROR Reconciler error {"controller": "remotemachine", "controllerGroup": "infrastructure.cluster.x-k8s.io", "controllerKind": "RemoteMachine", "RemoteMachine": {"name":"core-5g-cp-cluster-0","namespace":"default"}, "namespace": "default", "name": "core-5g-cp-cluster-0", "reconcileID": "52f3283a-d77a-4d6a-b74e-b659d2ccebe5", "error": "failed to run command: command failed: client exec: ssh session wait: Process exited with status 2"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.5/pkg/internal/controller/controller.go:227
kubectl describe remotemachine
Name: core-5g-cp-cluster-0
Namespace: default
Labels: cluster.x-k8s.io/cluster-name=core-5g-cp-cluster
cluster.x-k8s.io/control-plane=
cluster.x-k8s.io/control-plane-name=core-5g-cp-cluster
Annotations: cluster.x-k8s.io/cloned-from-groupkind: RemoteMachineTemplate.infrastructure.cluster.x-k8s.io
cluster.x-k8s.io/cloned-from-name: proxmox-cp-5g-pool-0
API Version: infrastructure.cluster.x-k8s.io/v1beta1
Kind: RemoteMachine
Metadata:
Creation Timestamp: 2024-05-26T16:43:44Z
Generation: 1
Owner References:
API Version: cluster.x-k8s.io/v1beta1
Block Owner Deletion: true
Controller: true
Kind: Machine
Name: core-5g-cp-cluster-0
UID: ed9964aa-7455-47d8-b5b1-0ce7e5dbccd1
Resource Version: 4222048
UID: b5d4f4db-b132-419a-827e-fcd5c1d8e071
Spec:
Pool: controlplane
Port: 22
User: root
Status:
Failure Message: failed to run command: command failed: client exec: ssh session wait: Process exited with status 2
Failure Reason: ProvisionFailed
Events: <none>
kubectl describe machine
Name: core-5g-cp-cluster-0
Namespace: default
Labels: cluster.x-k8s.io/cluster-name=core-5g-cp-cluster
cluster.x-k8s.io/control-plane=true
cluster.x-k8s.io/generateMachine-role=control-plane
Annotations: <none>
API Version: cluster.x-k8s.io/v1beta1
Kind: Machine
Metadata:
Creation Timestamp: 2024-05-26T16:43:44Z
Finalizers:
machine.cluster.x-k8s.io
Generation: 2
Owner References:
API Version: controlplane.cluster.x-k8s.io/v1beta1
Block Owner Deletion: true
Controller: true
Kind: K0sControlPlane
Name: core-5g-cp-cluster
UID: 56e95547-3a58-46da-817f-a0485880933d
Resource Version: 4222052
UID: ed9964aa-7455-47d8-b5b1-0ce7e5dbccd1
Spec:
Bootstrap:
Config Ref:
API Version: bootstrap.cluster.x-k8s.io/v1beta1
Kind: K0sControllerConfig
Name: core-5g-cp-cluster-0
Namespace: default
Data Secret Name: core-5g-cp-cluster-0
Cluster Name: core-5g-cp-cluster
Infrastructure Ref:
API Version: infrastructure.cluster.x-k8s.io/v1beta1
Kind: RemoteMachine
Name: core-5g-cp-cluster-0
Namespace: default
Node Deletion Timeout: 10s
Version: v1.27.1
Status:
Bootstrap Ready: true
Conditions:
Last Transition Time: 2024-05-26T16:43:46Z
Message: 1 of 2 completed
Reason: WaitingForInfrastructure
Severity: Info
Status: False
Type: Ready
Last Transition Time: 2024-05-26T16:43:46Z
Status: True
Type: BootstrapReady
Last Transition Time: 2024-05-26T16:43:45Z
Reason: WaitingForInfrastructure
Severity: Info
Status: False
Type: InfrastructureReady
Last Transition Time: 2024-05-26T16:43:45Z
Reason: WaitingForNodeRef
Severity: Info
Status: False
Type: NodeHealthy
Failure Message: Failure detected from referenced resource infrastructure.cluster.x-k8s.io/v1beta1, Kind=RemoteMachine with name "core-5g-cp-cluster-0": failed to run command: command failed: client exec: ssh session wait: Process exited with status 2
Failure Reason: ProvisionFailed
Last Updated: 2024-05-26T16:44:09Z
Observed Generation: 2
Phase: Failed
Events: <none>
ubuntu@cpamox-clusterapi-mgmt-node:~$ kubectl --kubeconfig /tmp/kubeconfig_k0smotron get no -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
core-5g-cp-cluster-0 Ready control-plane 5m31s v1.27.1+k0s 192.168.100.201 <none> Ubuntu 20.04.6 LTS 5.4.0-182-generic containerd://1.7.0
ubuntu@cpamox-clusterapi-mgmt-node:~$ kubectl --kubeconfig /tmp/kubeconfig_k0smotron get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-8jvjz 1/1 Running 0 5m40s
kube-system cilium-operator-78bc59578b-pblc2 1/1 Running 0 5m41s
kube-system coredns-878bb57ff-pvgqc 1/1 Running 0 5m41s
kube-system konnectivity-agent-kgtxl 1/1 Running 0 5m40s
kube-system kube-proxy-cmz4k 1/1 Running 0 5m40s
kube-system metrics-server-7f86dff975-57vfb 1/1 Running 0 5m41s
projectsveltos sveltos-agent-manager-c5784b88b-dltrh 1/1 Running 0 5m41s
root@k0smotron-cp-node-2:~# k0s status
Version: v1.27.1+k0s.0
Process ID: 22537
Role: controller
Workloads: true
SingleNode: false
Kube-api probing successful: true
Kube-api probing last error:
Also, the following link to the advanced worker configuration doc is not working: https://docs.k0sproject.io/stable/advanced/worker-configuration/
Sorry, the link is not working. Which doc are you referring to and what exactly is not working?
The control plane node was created, but the status still shows that provisioning failed, as per what I posted.
For the link, it is referenced in the docs: https://docs.k0smotron.io/stable/resource-reference/#k0scontrolplanespeck0sconfigspec
> The control plane node was created, but the status still shows that provisioning failed, as per what I posted.
Yeah, these false-positive `ProvisionFailed` errors unfortunately happen sometimes; we're working on making this more stable.
> For the link, it is referenced in the docs: https://docs.k0smotron.io/stable/resource-reference/#k0scontrolplanespeck0sconfigspec
Thanks! This is definitely a mistake in our docs, I will fix it!
Hi,
I am currently trying to create a k8s control-plane node on a dedicated VM using RemoteMachine (SSH), and I would like to know if the configuration is correct:
I also noticed that after applying the config, some pods in the k0smotron namespace go into CrashLoopBackOff:
Some logs: