kubernetes-sigs / kind

Kubernetes IN Docker - local clusters for testing Kubernetes
https://kind.sigs.k8s.io/
Apache License 2.0
13.44k stars 1.56k forks

[podman] creating a kind cluster with more than one control-plane aborts due to error #2858

Open jensfr opened 2 years ago

jensfr commented 2 years ago

What happened: I tried to create a cluster with two control-plane nodes and got an error message. Defining only one control-plane node, with the same overall number of nodes, works.

I0803 10:56:02.608617 120 with_retry.go:241] Got a Retry-After 1s response for attempt 8 to https://coco-external-load-balancer:6443/healthz?timeout=10s
I0803 10:56:02.610499 120 round_trippers.go:553] GET https://coco-external-load-balancer:6443/healthz?timeout=10s in 1 milliseconds
I0803 10:56:03.610795 120 with_retry.go:241] Got a Retry-After 1s response for attempt 9 to https://coco-external-load-balancer:6443/healthz?timeout=10s
I0803 10:56:03.613829 120 round_trippers.go:553] GET https://coco-external-load-balancer:6443/healthz?timeout=10s in 2 milliseconds

Full output: https://pastebin.com/MXNcbnUt

What you expected to happen: I expected the cluster to be created without errors.

How to reproduce it (as minimally and precisely as possible):

kind config:

cat kind-coco.config
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:

Use the config above, then run kind create cluster --config ./kind-coco.config

Anything else we need to know?:

output of kind export logs --name=coco

1918284630.zip

Environment:


aojea commented 2 years ago

Using multiple control-plane nodes means that kind installs an additional container as a load balancer, and it seems that this is what is failing, with a Retry-After response:

I0803 10:56:03.610795 120 with_retry.go:241] Got a Retry-After 1s response for attempt 9 to https://coco-external-load-balancer:6443/healthz?timeout=10s
I0803 10:56:03.613829 120 round_trippers.go:553] GET https://coco-external-load-balancer:6443/healthz?timeout=10s in 2 milliseconds

It seems this load balancer doesn't work in this setup. I would not be surprised if it is related to rootless networking. You should try the same config with rootful podman to verify this hypothesis.
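A rough way to check both points (whether podman is running rootless, and whether the load balancer container exists at all) is sketched below. It assumes the cluster name coco from the report and the container name pattern visible in the log lines above; the helper lb_name is hypothetical, and nothing here is a kind-provided command.

```shell
#!/bin/sh
# Cluster name taken from the report above (--name=coco).
cluster=coco

# Hypothetical helper: the log lines show the load balancer host as
# "<cluster>-external-load-balancer", so build the name the same way.
lb_name() {
  printf '%s-external-load-balancer' "$1"
}

lb=$(lb_name "$cluster")
echo "expecting load balancer container: $lb"

if command -v podman >/dev/null 2>&1; then
  # Reports whether podman is running in rootless mode.
  podman info --format '{{.Host.Security.Rootless}}'
  # List the container even if it has already exited.
  podman ps -a --filter "name=$lb"
else
  echo "podman not found; skipping container checks"
fi
```

If the container is not listed at all, the problem is probably in how it gets created rather than in its networking.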

jensfr commented 2 years ago

Unfortunately, it fails the same way when running rootful. I did:

BenTheElder commented 2 years ago

time="2022-08-03T13:00:00+02:00" level=error msg="OCI Runtime runc is in use by a container, but is not available (not in configuration file or not installed)"

from podman-info.txt looks interesting 🤔

The log export doesn't seem to have the load balancer container, I wonder if the kind podman provisioner doesn't handle 2 control planes properly or something.

speedyuk commented 2 years ago

I'm not going to be able to add a lot, but just to say I'm getting the same issue. I have a corporate-provided M1-based Mac. I installed brew and used it to install Podman and KinD. Creating a default one-node cluster works OK, but it fails at the 'Starting control-plane' step when using a cluster config file with more than one master node. Any help solving this issue would be much appreciated.

BenTheElder commented 2 years ago

A potential workaround is to try docker via lima as an alternative. Docker support in kind is more mature, and docker is a more stable target.

speedyuk commented 2 years ago

Hi Ben, thanks for that. I had a try, but it fell in the too-hard bucket. What has worked so far is using RancherDesktop, choosing the Docker mode and not enabling the built-in K8s. With that, the KinD config with 2 masters and 1 worker completed successfully. I believe Rancher is just using Lima under the hood. I would have preferred to be using containerd, but for now, if this is a solution, I'll try it for a while.
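For reference, a config of the shape described here (2 control-plane nodes and 1 worker) might look like the sketch below. The file name kind-multi-cp.yaml and the cluster name multi-cp are made up for illustration, and this is not the exact config used in the thread.

```shell
#!/bin/sh
# Hypothetical file name; the node layout mirrors the
# "2 masters and 1 worker" setup described above.
cat > kind-multi-cp.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: worker
EOF

# Count control-plane entries. With more than one, kind also starts an
# external load balancer container in front of the API servers.
cp_count=$(grep -c 'role: control-plane' kind-multi-cp.yaml)
echo "control-plane nodes: $cp_count"

# To create the cluster (requires kind and a working container runtime):
#   kind create cluster --name multi-cp --config kind-multi-cp.yaml
```

The create command is left as a comment so the snippet can be run without touching any existing cluster.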

BenTheElder commented 2 years ago

FWIW yes, Rancher Desktop is based on Lima.

containerd may be possible in the future, either via nerdctl more closely matching docker or via writing a kind backend.

I was actually thinking of Colima, which is also based on Lima: https://github.com/abiosoft/colima. When I last debugged a Lima / alpine related issue (https://github.com/abiosoft/colima/issues/291), I used it and found it well documented and quick to set up for docker 😅

BenTheElder commented 1 year ago

re: containerd, #2317

For now I'd recommend lima with the docker option (on an Ubuntu- or Debian-based guest environment).

xref: #3277