flatcar / Flatcar

Flatcar project repository for issue tracking, project documentation, etc.
https://www.flatcar.org/

OpenStack support #921

Closed · t-lo closed this 1 year ago

t-lo commented 1 year ago
tormath1 commented 1 year ago

I'd be interested in having a look at this one. I'll try to follow the status of the PR and see what's missing.

tormath1 commented 1 year ago

So far so good. I managed to deploy a workload cluster based on Flatcar and Ignition provisioning:

$ kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes
NAME                               STATUS   ROLES                  AGE     VERSION
flatcar-capo-control-plane-48fn9   Ready    control-plane,master   10m     v1.23.15
flatcar-capo-md-0-9r27m            Ready    <none>                 4m58s   v1.23.15
flatcar-capo-md-0-hb695            Ready    <none>                 2m56s   v1.23.15
flatcar-capo-md-0-zmlcx            Ready    <none>                 65s     v1.23.15

Using this template (and Ignition bootstrapping):

---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"] # CIDR block used by Calico.
    serviceDomain: "cluster.local"
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
    kind: OpenStackCluster
    name: ${CLUSTER_NAME}
  controlPlaneRef:
    kind: KubeadmControlPlane
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    name: ${CLUSTER_NAME}-control-plane
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackCluster
metadata:
  name: ${CLUSTER_NAME}
spec:
  cloudName: ${OPENSTACK_CLOUD}
  identityRef:
    name: ${CLUSTER_NAME}-cloud-config
    kind: Secret
  managedSecurityGroups: true
  nodeCidr: 10.6.0.0/24
  dnsNameservers:
  - ${OPENSTACK_DNS_NAMESERVERS}
  externalNetworkId: ${OPENSTACK_EXTERNAL_NETWORK_ID}
---
kind: KubeadmControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
metadata:
  name: "${CLUSTER_NAME}-control-plane"
spec:
  replicas: 1
  machineTemplate:
    infrastructureRef:
      kind: OpenStackMachineTemplate
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
      name: "${CLUSTER_NAME}-control-plane"
  kubeadmConfigSpec:
    format: ignition
    ignition:
      containerLinuxConfig:
        additionalConfig: |
          systemd:
            units:
            - name: kubeadm.service
              enabled: true
              dropins:
              - name: 10-flatcar.conf
                contents: |
                  [Unit]
                  Requires=containerd.service
                  After=containerd.service
    initConfiguration:
      nodeRegistration:
        name: '{{ local_hostname }}'
        kubeletExtraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
    clusterConfiguration:
      imageRepository: k8s.gcr.io
      apiServer:
        extraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
        extraVolumes:
        - name: cloud
          hostPath: /etc/kubernetes/cloud.conf
          mountPath: /etc/kubernetes/cloud.conf
          readOnly: true
      controllerManager:
        extraArgs:
          cloud-provider: openstack
          cloud-config: /etc/kubernetes/cloud.conf
        extraVolumes:
        - name: cloud
          hostPath: /etc/kubernetes/cloud.conf
          mountPath: /etc/kubernetes/cloud.conf
          readOnly: true
        - name: cacerts
          hostPath: /etc/certs/cacert
          mountPath: /etc/certs/cacert
          readOnly: true
    joinConfiguration:
      nodeRegistration:
        name: '{{ local_hostname }}'
        kubeletExtraArgs:
          cloud-config: /etc/kubernetes/cloud.conf
          cloud-provider: openstack
    files:
    - path: /etc/kubernetes/cloud.conf
      owner: root
      permissions: "0600"
      content: ${OPENSTACK_CLOUD_PROVIDER_CONF_B64}
      encoding: base64
    - path: /etc/certs/cacert
      owner: root
      permissions: "0600"
      content: ${OPENSTACK_CLOUD_CACERT_B64}
      encoding: base64
  version: "${KUBERNETES_VERSION}"
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-control-plane
spec:
  template:
    spec:
      flavor: ${OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR}
      image: ${OPENSTACK_IMAGE_NAME}
      sshKeyName: ${OPENSTACK_SSH_KEY_NAME}
      cloudName: ${OPENSTACK_CLOUD}
      identityRef:
        name: ${CLUSTER_NAME}-cloud-config
        kind: Secret
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: "${CLUSTER_NAME}-md-0"
spec:
  clusterName: "${CLUSTER_NAME}"
  replicas: ${WORKER_MACHINE_COUNT}
  selector:
    matchLabels:
  template:
    spec:
      clusterName: "${CLUSTER_NAME}"
      version: "${KUBERNETES_VERSION}"
      failureDomain: ${OPENSTACK_FAILURE_DOMAIN}
      bootstrap:
        configRef:
          name: "${CLUSTER_NAME}-md-0"
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
      infrastructureRef:
        name: "${CLUSTER_NAME}-md-0"
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
        kind: OpenStackMachineTemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha5
kind: OpenStackMachineTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      cloudName: ${OPENSTACK_CLOUD}
      identityRef:
        name: ${CLUSTER_NAME}-cloud-config
        kind: Secret
      flavor: ${OPENSTACK_NODE_MACHINE_FLAVOR}
      image: ${OPENSTACK_IMAGE_NAME}
      sshKeyName: ${OPENSTACK_SSH_KEY_NAME}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: ${CLUSTER_NAME}-md-0
spec:
  template:
    spec:
      files:
      - content: ${OPENSTACK_CLOUD_PROVIDER_CONF_B64}
        encoding: base64
        owner: root
        path: /etc/kubernetes/cloud.conf
        permissions: "0600"
      - content: ${OPENSTACK_CLOUD_CACERT_B64}
        encoding: base64
        owner: root
        path: /etc/certs/cacert
        permissions: "0600"
      joinConfiguration:
        nodeRegistration:
          name: '{{ local_hostname }}'
          kubeletExtraArgs:
            cloud-config: /etc/kubernetes/cloud.conf
            cloud-provider: openstack
      format: ignition
      ignition:
        containerLinuxConfig:
          additionalConfig: |
            systemd:
              units:
              - name: kubeadm.service
                enabled: true
                dropins:
                - name: 10-flatcar.conf
                  contents: |
                    [Unit]
                    Requires=containerd.service
                    After=containerd.service
---
apiVersion: v1
kind: Secret
metadata:
  name: ${CLUSTER_NAME}-cloud-config
  labels:
    clusterctl.cluster.x-k8s.io/move: "true"
data:
  clouds.yaml: ${OPENSTACK_CLOUD_YAML_B64}
  cacert: ${OPENSTACK_CLOUD_CACERT_B64}
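
For reference, OPENSTACK_CLOUD_YAML_B64 is the base64-encoded clouds.yaml for the tenant; a minimal sketch of what it might decode to (all names and endpoints are illustrative, and the cloud key must match ${OPENSTACK_CLOUD}):

clouds:
  capi:
    auth:
      auth_url: https://keystone.example.com:5000/v3
      username: capi-user
      password: secret
      project_name: capi-project
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne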

I have two hacks left to solve:

  1. The CoreDNS image pull fails:

     Jan 10 10:34:14 flatcar-capo-control-plane-48fn9.novalocal kubeadm.sh[1318]: [ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns:v1.8.6: output: time="2023-01-10T10:34:14Z" level=fatal msg="pulling image: rpc error: code = NotFound desc = failed to pull and unpack image \"k8s.gcr.io/coredns:v1.8.6\": failed to resolve reference \"k8s.gcr.io/coredns:v1.8.6\": k8s.gcr.io/coredns:v1.8.6: not found"

     Currently worked around with: sudo ctr --namespace k8s.io images tag registry.k8s.io/coredns/coredns:v1.8.6 k8s.gcr.io/coredns:v1.8.6. I remember seeing some renaming on the Kubernetes registry side (k8s.gcr.io moving to registry.k8s.io), so maybe CAPO has a hard reference to the old registry?

  2. {{ local_hostname }} is not rendered in the kubeadm config. I don't know yet what is supposed to render it.
lukasmrtvy commented 1 year ago

Good.

  1. You can use clusterConfiguration.imageRepository directly; see the sketch below.

  2. Ansible does, AFAIK. Did you try to escape it somehow?
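
A minimal sketch of that override in the kubeadmConfigSpec, assuming the goal is to point kubeadm at registry.k8s.io instead of the k8s.gcr.io hardcoded in the template above (note that the CoreDNS path kubeadm derives from a custom imageRepository differs across kubeadm versions, so the ctr retag may still be needed on older releases):

kubeadmConfigSpec:
  clusterConfiguration:
    imageRepository: registry.k8s.io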

jepio commented 1 year ago

We also need to fix image-builder to pass OEM_ID to Ansible (https://github.com/kubernetes-sigs/image-builder/blob/master/images/capi/packer/config/ansible-args.json).

> Ansible does, AFAIK. Did you try to escape it somehow?

It's cloud-init templating; so far we've solved this in a platform-specific way in the template for each provider. AWS: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-flatcar.yaml#L43-L78, vSphere: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/templates/cluster-template-ignition.yaml#L137-L208, Azure: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/4d9a2933fe1ec3e5dc5b1e8e78e4baf32ca38301/templates/cluster-template-flatcar.yaml#L116-L136.

I actually don't quite remember why this is necessary. It might have been because the system hostname gets set too late for kubeadm on some platforms, or because it's not an FQDN when it should be.

tormath1 commented 1 year ago

@jepio thanks for sharing the links. I think we can go ahead and do it like Azure, with preKubeadmCommands + the metadata service; see the sketch below.
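
A rough sketch of that approach adapted for OpenStack, assuming the EC2-compatible metadata endpoint and the /etc/kubeadm.yml location used by the Ignition bootstrap format; the @@HOSTNAME@@ placeholder mirrors the trick in the Azure template:

kubeadmConfigSpec:
  initConfiguration:
    nodeRegistration:
      name: '@@HOSTNAME@@'
  preKubeadmCommands:
  # Replace the placeholder with the instance hostname fetched from the
  # metadata service before kubeadm parses its config.
  - sed -i "s/@@HOSTNAME@@/$(curl -s http://169.254.169.254/latest/meta-data/hostname)/g" /etc/kubeadm.yml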

pothos commented 1 year ago

Do you use the plain image or https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.gz?

tormath1 commented 1 year ago

The management cluster uses https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.gz and the workload cluster uses an image produced by the image-builder (make build-qemu-flatcar with oem_id=openstack set as a Packer user variable).
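
For reference, the OEM ID can be passed through a Packer var file, e.g. PACKER_VAR_FILES=flatcar_oem.json make build-qemu-flatcar, where flatcar_oem.json (the file name is arbitrary) contains:

{
  "oem_id": "openstack"
}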

tormath1 commented 1 year ago
> > Ansible does, AFAIK. Did you try to escape it somehow?
>
> It's cloud-init templating; so far we've solved this in a platform-specific way in the template for each provider. AWS: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/templates/cluster-template-flatcar.yaml#L43-L78, vSphere: https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/templates/cluster-template-ignition.yaml#L137-L208, Azure: https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/4d9a2933fe1ec3e5dc5b1e8e78e4baf32ca38301/templates/cluster-template-flatcar.yaml#L116-L136.
>
> I actually don't quite remember why this is necessary. It might have been because the system hostname gets set too late for kubeadm on some platforms, or because it's not an FQDN when it should be.

I think it's just because the init and join configurations are handled by Ignition and not by cloud-init, so {{ local_hostname }} can't be rendered; that's why it takes the extra step.

lukasmrtvy commented 1 year ago

I am not familiar with Flatcar in detail, but isn't it possible to use Afterburn and the AFTERBURN_OPENSTACK_HOSTNAME env var here?

pothos commented 1 year ago

We have something to set it: https://github.com/flatcar/bootengine/blob/9b63fc4e01c1f615c2e24c5887736df3b51279b8/dracut/30ignition/flatcar-openstack-hostname.service

tormath1 commented 1 year ago

Here are my notes with the new template (which uses the coreos-metadata service): https://gist.github.com/tormath1/eef833300f2cc8ea79d5ce3bf126f311.
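
In essence, the hostname handling boils down to a kubeadm.service drop-in like the following condensed sketch (the environment file path and variable name are assumptions based on Flatcar's coreos-metadata service; the real template lives in the gist):

systemd:
  units:
  - name: kubeadm.service
    enabled: true
    dropins:
    - name: 10-flatcar.conf
      contents: |
        [Unit]
        Requires=containerd.service coreos-metadata.service
        After=containerd.service coreos-metadata.service
        [Service]
        # coreos-metadata exposes instance metadata as environment variables.
        EnvironmentFile=/run/metadata/flatcar
        ExecStartPre=/usr/bin/sed -i "s/@@HOSTNAME@@/${COREOS_OPENSTACK_HOSTNAME}/g" /etc/kubeadm.yml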

I didn't succeed in making it work with the latest Kubernetes version (1.26.0): kubelet was failing on deprecated flags. I can investigate, but this made me realize that we don't have tests yet for Kubernetes 1.26.0 in Flatcar; we should start with that first.

At least it works fine with 1.23.15 produced by the image-builder. What should we do from here? Add documentation somewhere? I don't know if it's worth adding the template + e2e tests to CAPO if it's already covered by documentation. WDYT?

lukasmrtvy commented 1 year ago

Thanks, good. Is there any traction on also producing CAPI images (like https://github.com/osism/k8s-capi-images does for OpenStack), or is that out of scope? Just asking :)

invidian commented 1 year ago

> What should we do from here? Add documentation somewhere?

Regarding images, I wonder if we should add a separate target in image-builder for building OpenStack images with whatever OpenStack-specific configuration is required, similarly to other platforms.

Regarding OEM_ID, I see this PR: https://github.com/kubernetes-sigs/image-builder/pull/966/files.

Regarding -o HostKeyAlgorithms=+ssh-rsa -o PubkeyAcceptedKeyTypes=+ssh-rsa, this should no longer be needed now that https://github.com/kubernetes-sigs/image-builder/pull/1035 is merged.

Regarding the templates, I think it would be nice to have them in https://github.com/kubernetes-sigs/cluster-api-provider-openstack/tree/main/templates, with or without e2e tests, depending on what the maintainers require.

In general, I was wondering whether it would make sense to have a page in the Flatcar documentation about Cluster API, even if it only references the documentation of specific providers. Perhaps that would help users discover the Flatcar+CAPI combination.

tormath1 commented 1 year ago

Flatcar templates are now available in the CAPO provider. Next steps are defined in this issue on the CAPO side: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/issues/1502.

Closing this.