Closed t-lo closed 1 year ago
I'd be interested to have a look at that one. I'll try to follow the status of the PR and see what's missing.
So far so good, I managed to deploy a workload cluster based on Flatcar and Ignition provisioning:
$ kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes
NAME                               STATUS   ROLES                  AGE     VERSION
flatcar-capo-control-plane-48fn9   Ready    control-plane,master   10m     v1.23.15
flatcar-capo-md-0-9r27m            Ready    <none>                 4m58s   v1.23.15
flatcar-capo-md-0-hb695            Ready    <none>                 2m56s   v1.23.15
flatcar-capo-md-0-zmlcx            Ready    <none>                 65s     v1.23.15
Using this template (and Ignition bootstrapping):
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"] # CIDR block used by Calico.
    serviceDomain: "cluster.local"
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
    kind: OpenStackCluster
    name: ${CLUSTER_NAME}
  controlPlaneRef:
    kind: KubeadmControlPlane
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    name: ${CLUSTER_NAME}-control-plane
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackCluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  cloudName: ${OPENSTACK_CLOUD}
  identityRef:
    name: ${CLUSTER_NAME}-cloud-config
    kind: Secret
  managedSecurityGroups: true
  nodeCidr: 10.6.0.0/24
  dnsNameservers:
    - ${OPENSTACK_DNS_NAMESERVERS}
  externalNetworkId: ${OPENSTACK_EXTERNAL_NETWORK_ID}
---
kind: KubeadmControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  replicas: 1
  machineTemplate:
    infrastructureRef:
      kind: OpenStackMachineTemplate
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
      name: "${CLUSTER_NAME}-control-plane"
  kubeadmConfigSpec:
    format: ignition
    ignition:
      containerLinuxConfig:
        additionalConfig: |
          systemd:
            units:
            - name: kubeadm.service
              enabled: true
              dropins:
              - name: 10-flatcar.conf
                contents: |
                  [Unit]
                  Requires=containerd.service
                  After=containerd.service
    initConfiguration:
      nodeRegistration:
        name: '{{ local_hostname }}'
        kubeletExtraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
    clusterConfiguration:
      imageRepository: k8s.gcr.io
      apiServer:
        extraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
        extraVolumes:
          - name: cloud
            hostPath: /etc/kubernetes/cloud.conf
            mountPath: /etc/kubernetes/cloud.conf
            readOnly: true
      controllerManager:
        extraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
        extraVolumes:
          - name: cloud
            hostPath: /etc/kubernetes/cloud.conf
            mountPath: /etc/kubernetes/cloud.conf
            readOnly: true
          - name: cacerts
            hostPath: /etc/certs/cacert
            mountPath: /etc/certs/cacert
            readOnly: true
    joinConfiguration:
      nodeRegistration:
        name: '{{ local_hostname }}'
        kubeletExtraArgs:
          cloud-config: /etc/kubernetes/cloud.conf
          cloud-provider: openstack
    files:
      - path: /etc/kubernetes/cloud.conf
        owner: root
        permissions: "0600"
        content: ${OPENSTACK_CLOUD_PROVIDER_CONF_B64}
        encoding: base64
      - path: /etc/certs/cacert
        owner: root
        permissions: "0600"
        content: ${OPENSTACK_CLOUD_CACERT_B64}
        encoding: base64
  version: "${KUBERNETES_VERSION}"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-control-plane
spec:
  template:
    spec:
      flavor: ${OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR}
      image: ${OPENSTACK_IMAGE_NAME}
      sshKeyName: ${OPENSTACK_SSH_KEY_NAME}
      cloudName: ${OPENSTACK_CLOUD}
      identityRef:
        name: ${CLUSTER_NAME}-cloud-config
        kind: Secret
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: "${CLUSTER_NAME}-md-0"
spec:
  clusterName: "${CLUSTER_NAME}"
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels:
  template:
    spec:
      clusterName: "${CLUSTER_NAME}"
      version: "${KUBERNETES_VERSION}"
      failureDomain: ${OPENSTACK_FAILURE_DOMAIN}
      bootstrap:
        configRef:
          name: "${CLUSTER_NAME}-md-0"
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
      infrastructureRef:
        name: "${CLUSTER_NAME}-md-0"
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
        kind: OpenStackMachineTemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      cloudName: ${OPENSTACK_CLOUD}
      identityRef:
        name: ${CLUSTER_NAME}-cloud-config
        kind: Secret
      flavor: ${OPENSTACK_NODE_MACHINE_FLAVOR}
      image: ${OPENSTACK_IMAGE_NAME}
      sshKeyName: ${OPENSTACK_SSH_KEY_NAME}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      files:
        - content: ${OPENSTACK_CLOUD_PROVIDER_CONF_B64}
          encoding: base64
          owner: root
          path: /etc/kubernetes/cloud.conf
          permissions: "0600"
        - content: ${OPENSTACK_CLOUD_CACERT_B64}
          encoding: base64
          owner: root
          path: /etc/certs/cacert
          permissions: "0600"
      joinConfiguration:
        nodeRegistration:
          name: '{{ local_hostname }}'
          kubeletExtraArgs:
            cloud-config: /etc/kubernetes/cloud.conf
            cloud-provider: openstack
      format: ignition
      ignition:
        containerLinuxConfig:
          additionalConfig: |
            systemd:
              units:
              - name: kubeadm.service
                enabled: true
                dropins:
                - name: 10-flatcar.conf
                  contents: |
                    [Unit]
                    Requires=containerd.service
                    After=containerd.service
---
apiVersion: v1
kind: Secret
metadata:
  name: ${CLUSTER_NAME}-cloud-config
  labels:
    clusterctl.cluster.x-k8s.io/move: "true"
data:
  clouds.yaml: ${OPENSTACK_CLOUD_YAML_B64}
  cacert: ${OPENSTACK_CLOUD_CACERT_B64}
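For completeness, the *_B64 variables consumed by this template are just the base64-encoded file contents; preparing them might look like this (file names and contents below are placeholders, not the real cloud config):

```shell
# Create a placeholder clouds.yaml just for this demonstration.
cat > clouds.yaml <<'EOF'
clouds:
  mycloud:
    auth:
      auth_url: https://keystone.example.com/v3
EOF

# Encode without line wrapping (-w0 is GNU coreutils; on macOS plain
# `base64` already emits a single line) so the value is one token.
export OPENSTACK_CLOUD_YAML_B64=$(base64 -w0 < clouds.yaml)

# Sanity check: decoding must round-trip to the original file.
printf '%s' "$OPENSTACK_CLOUD_YAML_B64" | base64 -d > decoded.yaml
diff clouds.yaml decoded.yaml && echo "round-trip OK"
```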
I have two hacks to solve:

1. Pulling the CoreDNS image fails:

   Jan 10 10:34:14 flatcar-capo-control-plane-48fn9.novalocal kubeadm.sh[1318]: [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns:v1.8.6: output: time="2023-01-10T10:34:14Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"k8s.gcr.io/coredns:v1.8.6\": failed to resolve reference \"k8s.gcr.io/coredns:v1.8.6\": k8s.gcr.io/coredns:v1.8.6: not found"

   Currently solved with: sudo ctr --namespace k8s.io images tag registry.k8s.io/coredns/coredns:v1.8.6 k8s.gcr.io/coredns:v1.8.6. I remember seeing some renaming on the k8s registry side, so maybe CAPO has a hard reference to this?

2. {{ local_hostname }} is not rendered in the kubeadm config. I don't know yet who's supposed to render it.

Good. You can use clusterConfiguration.imageRepository directly.
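In the KubeadmControlPlane spec above, that suggestion amounts to something like the following fragment. Note this is a sketch: whether kubeadm resolves the coredns/coredns sub-path correctly for a custom imageRepository depends on the kubeadm version, so this needs verifying.

```yaml
kubeadmConfigSpec:
  clusterConfiguration:
    # Pull control-plane images from the community registry the images
    # moved to, instead of the deprecated k8s.gcr.io, which should make
    # the manual `ctr` retag workaround unnecessary.
    imageRepository: registry.k8s.io
```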
Ansible does AFAIK. Did you try to escape it somehow?

We also need to fix image-builder to pass OEM_ID to Ansible (https://github.com/kubernetes-sigs/image-builder/blob/master/images/capi/packer/config/ansible-args.json).
- Ansible does AFAIK. Did you try to escape it somehow?

It's cloud-init; so far we've solved this in a platform-specific way in the template for each provider. AWS: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-flatcar.yaml#L43-L78, vSphere: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/templates/cluster-template-ignition.yaml#L137-L208, Azure: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/4d9a2933fe1ec3e5dc5b1e8e78e4baf32ca38301/templates/cluster-template-flatcar.yaml#L116-L136.

I actually don't quite remember why this is necessary. It might have been because the system hostname gets set too late for kubeadm on some platforms, or because it's not an FQDN when it should be.
@jepio thanks for sharing the links. I think we can go ahead and do it like Azure, with preKubeadmCommands + the metadata service.
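Along the lines of the Azure template linked above, the idea would be roughly the following (a sketch, not tested here: the @@HOSTNAME@@ placeholder, the /etc/kubeadm.yml path, and the EC2-compatible metadata endpoint are all assumptions):

```yaml
kubeadmConfigSpec:
  joinConfiguration:
    nodeRegistration:
      # hypothetical placeholder, substituted before kubeadm runs
      name: '@@HOSTNAME@@'
  preKubeadmCommands:
    # Fetch the instance hostname from the OpenStack metadata service
    # (EC2-compatible endpoint) and patch the rendered kubeadm config.
    - export HOSTNAME=$(curl -s http://169.254.169.254/latest/meta-data/hostname)
    - sed -i "s/@@HOSTNAME@@/${HOSTNAME}/g" /etc/kubeadm.yml
```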
Do you use the plain image or https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.gz?
The management cluster uses https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.gz, and the workload cluster uses an image produced by image-builder (make build-qemu-flatcar with oem_id=openstack set as a Packer user variable).
- Ansible does AFAIK. Did you try to escape it somehow?

- It's cloud-init; so far we've solved this in a platform-specific way in the template for each provider. AWS: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-flatcar.yaml#L43-L78, vSphere: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/templates/cluster-template-ignition.yaml#L137-L208, Azure: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/4d9a2933fe1ec3e5dc5b1e8e78e4baf32ca38301/templates/cluster-template-flatcar.yaml#L116-L136. I actually don't quite remember why this is necessary. It might have been because the system hostname gets set too late for kubeadm on some platforms, or because it's not an FQDN when it should be.
I think it's just because initConfiguration and joinConfiguration are handled by Ignition and not by cloud-init, so {{ local_hostname }} can't be rendered; that's why it takes an extra step.
I am not familiar with Flatcar in detail, but isn't it possible to use Afterburn and the AFTERBURN_OPENSTACK_HOSTNAME environment variable here?
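If Afterburn does populate that variable on OpenStack, it could perhaps be wired in through the Container Linux Config, e.g. as below. This is purely a sketch: the coreos-metadata.service unit name, the /run/metadata/flatcar environment file, and sed-patching /etc/kubeadm.yml are assumptions about how Afterburn exposes its variables and where the bootstrap config lands.

```yaml
ignition:
  containerLinuxConfig:
    additionalConfig: |
      systemd:
        units:
        - name: kubeadm.service
          dropins:
          - name: 20-afterburn.conf
            contents: |
              [Unit]
              Requires=coreos-metadata.service
              After=coreos-metadata.service
              [Service]
              # Afterburn is assumed to write AFTERBURN_OPENSTACK_HOSTNAME here
              EnvironmentFile=/run/metadata/flatcar
              ExecStartPre=/usr/bin/sed -i "s/{{ local_hostname }}/${AFTERBURN_OPENSTACK_HOSTNAME}/g" /etc/kubeadm.yml
```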
Here are my notes with the new template (that uses the coreos-metadata service): https://gist.github.com/tormath1/eef833300f2cc8ea79d5ce3bf126f311.
I didn't manage to make it work with the latest Kubernetes version (1.26.0): kubelet was failing on deprecated flags. I can investigate, but this made me realize that we don't have tests yet for Kubernetes 1.26.0 in Flatcar; we should start with that first. At least it works fine with 1.23.15 produced by image-builder. What should we do next? Add documentation somewhere? I don't know if it's worth adding the template + e2e tests to CAPO if it's already covered by documentation. WDYT?
Thanks. Good. Is there any traction to also produce capi-images (like https://github.com/osism/k8s-capi-images does for OpenStack), or is that out of scope? Just asking :)
- What should we do from now? Add documentation somewhere?
Regarding images, I wonder if we should add a separate target in image-builder for building OpenStack images, with whatever OpenStack-specific configuration is required, similarly to other platforms?

Regarding OEM_ID, I see this PR: https://github.com/kubernetes-sigs/image-builder/pull/966/files.

Regarding -o HostKeyAlgorithms=+ssh-rsa -o PubkeyAcceptedKeyTypes=+ssh-rsa, this should no longer be needed now that https://github.com/kubernetes-sigs/image-builder/pull/1035 is merged.
Regarding the templates, I think it would be nice to have them in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/tree/main/templates, with or without e2e tests, depending on what maintainers require.
In general, I was wondering whether it would make sense to have a page in the Flatcar documentation about Cluster API, even if it just references the documentation of specific providers. Perhaps that would help users discover the Flatcar + CAPI combination.
Flatcar templates are now available in the CAPO provider. Next steps are defined in this issue on the CAPO side: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1502.
Closing this.