SovereignCloudStack / cluster-stacks

Definition of Cluster Stacks based on the ClusterAPI ClusterClass feature
https://scs.community/
Apache License 2.0

:sparkles: Add Metal3 provider #82

Closed: chess-knight closed this PR 3 weeks ago

chess-knight commented 2 months ago

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in `fixes #<issue_number>` format; will close the issue(s) when the PR gets merged): Fixes #21 Fixes #98

Special notes for your reviewer: tested on a virtualized environment; see the docs at https://book.metal3.io/quick-start

  1. Create an Ubuntu instance in gx-scs - flavor SCS-16V-64 with a 200 GiB disk
  2. Create libvirt network - https://book.metal3.io/quick-start#virtualized-configuration
    $ virsh net-info baremetal
    Name:           baremetal
    UUID:           ae14ef12-4ff1-4c54-90c8-38ebdec3542b
    Active:         yes
    Persistent:     yes
    Autostart:      no
    Bridge:         metal3
  3. If the management cluster (CSO, CAPI, CAPM3, BMO) is outside the "bare-metal" instance (Ironic, libvirt), install the libvirt port-forwarding hook
    • use this hooks.json:
      {
        "bmh-vm-01": {
          "interface": "metal3",
          "private_ip": "192.168.222.150",
          "port_map": {
            "tcp": [
              6443
            ]
          }
        }
      }
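As a quick sanity check, the hooks.json above can be parsed to list which host ports the hook will forward to which guest IP. This is only an illustrative sketch, not part of the Metal3 docs; the file content is inlined so the snippet is self-contained:

```python
import json

# The hooks.json from step 3, inlined for a self-contained check.
hooks = json.loads("""
{
  "bmh-vm-01": {
    "interface": "metal3",
    "private_ip": "192.168.222.150",
    "port_map": {"tcp": [6443]}
  }
}
""")

# For each VM entry, collect the TCP ports that the libvirt hook will
# forward from the host to the guest's private IP.
forwards = [
    (vm, cfg["private_ip"], port)
    for vm, cfg in hooks.items()
    for port in cfg["port_map"]["tcp"]
]
print(forwards)  # [('bmh-vm-01', '192.168.222.150', 6443)]
```

Here only the Kubernetes API port 6443 is forwarded, which is what the external management cluster needs to reach the workload cluster's control plane.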
  4. Create VMs - https://book.metal3.io/quick-start#virtualized-configuration
    • e.g. create 1 for control-plane and 3 for workers:
      virt-install \
      --connect qemu:///system \
      --name bmh-vm-01 `# workers 02, 03, 04` \
      --description "Virtualized BareMetalHost" \
      --osinfo=ubuntu-lts-latest \
      --ram=12288 \
      --vcpus=2 `# e.g. 3 vcpus for workers` \
      --disk size=25 `# add second disk (--disk size=20) for workers if you want to install rook-ceph` \
      --graphics=none \
      --console pty \
      --serial pty \
      --pxe \
      --network network=baremetal,mac="00:60:2f:31:81:01" `# workers 02, 03, 04` \
      --noautoconsole
      $ virsh list
      Id   Name        State
      ---------------------------
      1    bmh-vm-01   running
      2    bmh-vm-02   running
      3    bmh-vm-03   running
      4    bmh-vm-04   running
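Since the four VMs differ only in name, MAC address, vCPU count, and disks, the worker invocations can be generated instead of retyped. A hypothetical sketch; the vcpus/disk values and the MAC suffixes 02..04 are taken from the inline comments in the virt-install command above:

```python
# Build the virt-install command line for worker VM number n (2..4).
# Workers get 3 vcpus and an optional second 20 GiB disk for rook-ceph,
# per the comments in the command above.
def worker_cmd(n: int) -> str:
    return (
        f"virt-install --connect qemu:///system --name bmh-vm-{n:02d} "
        f"--description 'Virtualized BareMetalHost' "
        f"--osinfo=ubuntu-lts-latest --ram=12288 --vcpus=3 "
        f"--disk size=25 --disk size=20 "  # second disk only if you want rook-ceph
        f"--graphics=none --console pty --serial pty --pxe "
        f"--network network=baremetal,mac=00:60:2f:31:81:{n:02d} "
        f"--noautoconsole"
    )

for n in (2, 3, 4):
    print(worker_cmd(n))
```

Keeping the MAC suffix in lockstep with the VM name matters later: the same MACs reappear in the dnsmasq reservations (step 7) and in the BareMetalHost `bootMACAddress` fields (step 11).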
  5. Install sushy-tools for Redfish communication - https://book.metal3.io/quick-start#sushy-tools---aka-the-bmc
    $ docker logs sushy-tools
    * Serving Flask app 'sushy_tools.emulator.main'
    * Debug mode: off
    WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
    * Running on http://192.168.222.1:8000
    Press CTRL+C to quit
  6. Create KinD management cluster - https://book.metal3.io/quick-start#management-cluster
  7. Install Dnsmasq - https://book.metal3.io/quick-start#dhcp-server
    • use this config:
      DHCP_HOSTS=00:60:2f:31:81:01,192.168.222.100;00:60:2f:31:81:02,192.168.222.101;00:60:2f:31:81:03,192.168.222.102;00:60:2f:31:81:04,192.168.222.103
      DHCP_IGNORE=tag:!known
      # IP of the host from VM perspective
      PROVISIONING_IP=192.168.222.1
      GATEWAY_IP=192.168.222.1
      DHCP_RANGE=192.168.222.100,192.168.222.149
      DNS_IP=provisioning
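The DHCP_HOSTS line packs the MAC/IP reservations into one semicolon-separated string; combined with DHCP_IGNORE=tag:!known, only these four MACs get leases. A small sketch (not part of the Metal3 docs) that parses the string and checks each reserved address sits inside DHCP_RANGE:

```python
import ipaddress

# Reservations from the dnsmasq config in step 7: ";"-separated "MAC,IP" pairs.
DHCP_HOSTS = ("00:60:2f:31:81:01,192.168.222.100;"
              "00:60:2f:31:81:02,192.168.222.101;"
              "00:60:2f:31:81:03,192.168.222.102;"
              "00:60:2f:31:81:04,192.168.222.103")
DHCP_RANGE = ("192.168.222.100", "192.168.222.149")

# MAC -> IP mapping.
reservations = dict(pair.split(",") for pair in DHCP_HOSTS.split(";"))

lo, hi = (ipaddress.ip_address(a) for a in DHCP_RANGE)
# Every reserved IP must fall inside the advertised DHCP range.
assert all(lo <= ipaddress.ip_address(ip) <= hi for ip in reservations.values())
print(reservations["00:60:2f:31:81:01"])  # 192.168.222.100
```

Note that the MACs match the `--network ... mac=` values of the VMs from step 4, so each BareMetalHost gets a predictable address.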
  8. Skip the image server (we use OSISM images) - https://book.metal3.io/quick-start#image-server
  9. Deploy Ironic - https://book.metal3.io/quick-start#deploy-ironic
    • if Ironic should be accessible from outside, add the public IP to the Ironic certificates, e.g. with a patch like:
      - patch: |-
          - op: replace
            path: /spec/ipAddresses/0
            value: 192.168.222.1
          - op: add
            path: /spec/ipAddresses/-
            value: 172.18.0.2
          - op: add
            path: /spec/ipAddresses/-
            value: 213.131.230.81
        target:
          kind: Certificate
          name: ironic-cert|ironic-inspector-cert
  10. Deploy Bare Metal Operator - https://book.metal3.io/quick-start#deploy-bare-metal-operator
    • if Ironic is outside of the management cluster, modify bmo/ironic.env as follows:
      DEPLOY_KERNEL_URL=http://192.168.222.1:6180/images/ironic-python-agent.kernel
      DEPLOY_RAMDISK_URL=http://192.168.222.1:6180/images/ironic-python-agent.initramfs
      IRONIC_ENDPOINT=https://213.131.230.81:6385/v1/
    • if Ironic is outside of the management cluster, copy the ironic-cacert secret into the management cluster so BMO can use it (or use IRONIC_INSECURE=True)
  11. Create BareMetalHosts - https://book.metal3.io/quick-start#create-baremetalhosts
    • 1 for control-plane and 3 for workers:
      apiVersion: v1
      kind: Secret
      metadata:
        name: bml-01 # workers 02, 03, 04
      type: Opaque
      stringData:
        username: replaceme
        password: replaceme
      ---
      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        name: bml-vm-01 # workers 02, 03, 04
        labels:
          type: control-plane # 'type: worker' for workers
      spec:
        online: true
        bootMACAddress: 00:60:2f:31:81:01 # workers 02, 03, 04
        bootMode: legacy
        hardwareProfile: libvirt
        bmc:
          address: redfish-virtualmedia+http://192.168.222.1:8000/redfish/v1/Systems/bmh-vm-01 # workers 02, 03, 04
          credentialsName: bml-01 # workers 02, 03, 04
      $ kubectl get bmh --show-labels
      NAME        STATE       CONSUMER   ONLINE   ERROR   AGE   LABELS
      bml-vm-01   available              true             11m   type=control-plane
      bml-vm-02   available              true             11m   type=worker
      bml-vm-03   available              true             11m   type=worker
      bml-vm-04   available              true             11m   type=worker
  12. Deploy CAPI/CAPM3/CSO
    export CLUSTER_TOPOLOGY=true
    clusterctl init --infrastructure metal3
    # apply Metal3ClusterTemplate CRD until new CAPM3 release (current v1.7.0)
    kubectl apply -f https://raw.githubusercontent.com/metal3-io/cluster-api-provider-metal3/main/config/crd/bases/infrastructure.cluster.x-k8s.io_metal3clustertemplates.yaml
    kubectl label crd metal3clustertemplates.infrastructure.cluster.x-k8s.io cluster.x-k8s.io/v1beta1=v1beta1
    # install CSO in your favourite way
  13. Create Cluster Stack
    apiVersion: clusterstack.x-k8s.io/v1alpha1
    kind: ClusterStack
    metadata:
      name: clusterstack
    spec:
      provider: metal3
      name: alpha
      kubernetesVersion: "1.28"
      channel: custom
      autoSubscribe: false
      noProvider: true
      versions:
      - v0-sha.b699b93
    $ kubectl get clusterstack
    NAME           PROVIDER   CLUSTERSTACK   K8S    CHANNEL   AUTOSUBSCRIBE   USABLE           LATEST                                       AGE   REASON   MESSAGE
    clusterstack   metal3     alpha          1.28   custom    false           v0-sha-b699b93   metal3-alpha-1-28-v0-sha-b699b93 | v1.28.9   12m
  14. Create Cluster

    apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    metadata:
      name: my-cluster
    spec:
      topology:
        class: metal3-alpha-1-28-v0-sha.b699b93
        version: v1.28.9
        controlPlane:
          replicas: 1
        workers:
          machineDeployments:
          - class: default-worker
            name: alpha
            replicas: 3
        variables:
    #   Required
        - name: controlPlaneEndpoint
          value:
            host: 192.168.222.150
    #        host: 213.131.230.81
    #        port: 6443
    #   If .controlPlaneEndpoint.host is public IP, specify also private IP for kube-vip
    #    - name: controlPlaneEndpoint_private_ip
    #      value: 192.168.222.150
    #   Optional
        - name: workerHostSelector
          value:
            matchLabels:
              type: worker
        - name: controlPlaneHostSelector
          value:
            matchLabels:
              type: control-plane
    ##   Experiment with other optional variables, e.g. try rook-ceph
    #    - name: user
    #      value:
    #        name: user
    #        sshKey: ssh-ed25519 ABCD... user@example.com
    #    - name: image
    #      value:
    #        checksum: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.10.qcow2.CHECKSUM
    #        checksumType: sha256
    #        format: qcow2
    #        url: https://swift.services.a.regiocloud.tech/swift/v1/AUTH_b182637428444b9aa302bb8d5a5a418c/openstack-k8s-capi-images/ubuntu-2204-kube-v1.28/ubuntu-2204-kube-v1.28.10.qcow2
    #    - name: rook_ceph_cluster_values
    #      value: |
    #        enabled: true
    #    - name: workerDataTemplate
    #      value: my-cluster-workers-template
    #    - name: controlPlaneDataTemplate
    #      value: my-cluster-controlplane-template
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-controlplane-template
    #spec:
    #  clusterName: my-cluster
    #---
    #apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    #kind: Metal3DataTemplate
    #metadata:
    #  name: my-cluster-workers-template
    #spec:
    #  clusterName: my-cluster
    $ kubectl get cluster,metal3cluster
    NAME                                  CLUSTERCLASS                       PHASE         AGE   VERSION
    cluster.cluster.x-k8s.io/my-cluster   metal3-alpha-1-28-v0-sha.b699b93   Provisioned   62m   v1.28.9
    
    NAME                                                             AGE   READY   ERROR   CLUSTER      ENDPOINT
    metal3cluster.infrastructure.cluster.x-k8s.io/my-cluster-srg2j   62m   true            my-cluster   {"host":"192.168.222.150","port":6443}
    $ clusterctl get kubeconfig my-cluster > kubeconfig.yaml
  15. Test kube-vip service loadbalancing

    $ kubectl --kubeconfig kubeconfig.yaml create deploy --image nginx --port 80 nginx
    # --load-balancer-ip needs to be specified because kube-vip-cloud-provider is missing
    $ kubectl --kubeconfig kubeconfig.yaml expose deployment nginx --port 80 --type LoadBalancer --load-balancer-ip 192.168.222.151
    $ curl 192.168.222.151
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
    html { color-scheme: light dark; }
    body { width: 35em; margin: 0 auto;
    font-family: Tahoma, Verdana, Arial, sans-serif; }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs: