kubernetes-sigs / cluster-api-provider-openstack

Cluster API implementation for OpenStack
https://cluster-api-openstack.sigs.k8s.io/
Apache License 2.0

Control plane node is up, but worker node is stuck in pending state in openstack. #2126

Open andresache opened 2 weeks ago

andresache commented 2 weeks ago

/kind bug

Hello,

Has anyone faced a similar issue before or can guide me in the right direction here?

I am trying to use the cluster-api-bootstrap-provider-microk8s project to create my Kubernetes cluster in OpenStack: https://github.com/canonical/cluster-api-bootstrap-provider-microk8s?tab=readme-ov-file

The cluster and the control plane node are created successfully, but the worker node gets stuck in the Pending state.

Cluster being created: image

Control plane and worker nodes: image

These are my capi pods that are running in my local management microk8s cluster: image

And here are some logs of the capo-controller-manager-7c468b6c46-j8xrp pod:

I0612 22:01:54.068515       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="O
I0612 22:01:54.576844       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controller
I0612 22:01:54.576943       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKi
I0612 22:01:55.535139       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachin
I0612 22:01:57.041703       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="Op
I0612 22:01:57.041981       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="O
I0612 22:01:59.494677       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controller
I0612 22:01:59.494791       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKi
I0612 22:05:10.818129       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io
I0612 22:05:10.819972       1 openstackcluster_controller.go:348] "Reconciling Cluster" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluste
I0612 22:05:10.819989       1 openstackcluster_controller.go:686] "Reconciling network components" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="Open
I0612 22:05:10.819986       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachin
I0612 22:05:11.101521       1 network.go:119] "External network found" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" OpenStackClust
I0612 22:05:11.101904       1 network.go:125] "Reconciling network" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" OpenStackCluster=
I0612 22:05:11.105085       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="Op
I0612 22:05:11.105108       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="O
I0612 22:05:11.408652       1 network.go:205] "Reconciling subnet" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" OpenStackCluster="
I0612 22:05:11.613840       1 router.go:49] "Reconciling router" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" OpenStackCluster="de
I0612 22:05:11.720612       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controller
I0612 22:05:11.720713       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKi
I0612 22:05:12.034103       1 securitygroups.go:44] "Reconciling security groups" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackCluster" Ope
I0612 22:05:12.229131       1 openstackcluster_controller.go:397] "Reconciled Cluster created successfully" controller="openstackcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerK

And here are some logs of the capi-controller-manager-5d79cb94cf-qz8nr pod:

I0612 22:29:50.416964       1 machine_controller_phases.go:236] "Waiting for bootstrap provider to generate data secret and report status.ready" controller="machine" controllerGroup="cluster.x-k8s.io" controllerKind="Machine" Machine="default/microk8s-openstack-md-0-fk4pl-2crwc" namespace="default" name="microk8s-openstack-md-0-fk4pl-2crwc" reconcileID="7898ea02-949f-4f20-8b51-cd381dc84051" MachineSet="default/microk8s-openstack-md-0-fk4pl" MachineDeployment="default/microk8s-openstack-md-0" Cluster="default/microk8s-openstack" MicroK8sConfig="default/microk8s-openstack-md-0-fk4pl-2crwc"
I0612 22:29:50.417309       1 machine_controller_phases.go:306] "Waiting for infrastructure provider to create machine infrastructure and report status.ready" controller="machine" controllerGroup="cluster.x-k8s.io" controllerKind="Machine" Machine="default/microk8s-openstack-md-0-fk4pl-2crwc" namespace="default" name="microk8s-openstack-md-0-fk4pl-2crwc" reconcileID="7898ea02-949f-4f20-8b51-cd381dc84051" MachineSet="default/microk8s-openstack-md-0-fk4pl" MachineDeployment="default/microk8s-openstack-md-0" Cluster="default/microk8s-openstack" OpenStackMachine="default/microk8s-openstack-md-0-fk4pl-2crwc"
I0612 22:29:50.417325       1 machine_controller_noderef.go:60] "Waiting for infrastructure provider to report spec.providerID" controller="machine" controllerGroup="cluster.x-k8s.io" controllerKind="Machine" Machine="default/microk8s-openstack-md-0-fk4pl-2crwc" namespace="default" name="microk8s-openstack-md-0-fk4pl-2crwc" reconcileID="7898ea02-949f-4f20-8b51-cd381dc84051" MachineSet="default/microk8s-openstack-md-0-fk4pl" MachineDeployment="default/microk8s-openstack-md-0" Cluster="default/microk8s-openstack" OpenStackMachine="default/microk8s-openstack-md-0-fk4pl-2crwc"
E0612 22:29:52.564694       1 controller.go:329] "Reconciler error" err="failed to create cluster accessor: error creating http client and mapper for remote cluster \"default/microk8s-openstack\": error creating client for remote cluster \"default/microk8s-openstack\": error getting rest mapping: failed to get API group resources: unable to retrieve the complete list of server APIs: v1: Get \"https://10.8.8.194:6443/api/v1?timeout=10s\": dial tcp 10.8.8.194:6443: connect: connection refused" controller="machine" controllerGroup="cluster.x-k8s.io" controllerKind="Machine" Machine="default/microk8s-openstack-control-plane-6mgf5" namespace="default" name="microk8s-openstack-control-plane-6mgf5" reconcileID="4735793b-d189-4006-bffa-01a9bdbf5f9e"

Let me know if any additional information is needed. Any help would be much appreciated.
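The `connection refused` error above is a plain TCP failure against the control plane endpoint, so it can be checked independently of Cluster API. A minimal sketch in Python, assuming it is run from a machine with the same network path as the management cluster; `probe_tcp` is a hypothetical helper name, and the address and port are the ones from the error message:

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a TCP connect and classify the outcome as a short string."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout (filtered or host down).
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host.
        return f"error: {exc}"


# Example call with the endpoint from the log above:
# print(probe_tcp("10.8.8.194", 6443))
```

A `refused` result would be consistent with the log: the instance is reachable but no API server is listening on 6443 yet. A `timeout` would point more towards security groups or routing.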

EmilienM commented 2 weeks ago

Hi @andresache, thanks for the bug report. For future reference, please use proper Markdown code blocks next time so the output is easy to read.

As for the bug itself, you should file it against microk8s, not here. I don't see any errors in CAPO, but I can see failures on the microk8s side. If you want to provide more information, please share the full describe output of the OpenStackMachine and OpenStackCluster objects.

Thanks
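To separate real errors from routine reconcile messages in logs like the ones above, the klog severity prefix (`I`, `W`, `E`, `F`) at the start of each line can be used as a filter. A minimal sketch in Python, assuming the standard klog header layout; `parse_klog` and `errors_only` are hypothetical helper names, and the two sample lines are shortened copies of lines from this thread:

```python
import re

# klog header: severity letter, MMDD, time, process/thread id, file:line, message.
KLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<pid>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_klog(line):
    """Parse one klog line into a dict, or return None if it does not match."""
    # Strip whitespace and the terminal border characters seen in the pasted logs.
    cleaned = line.strip(" \t\r\n│")
    match = KLOG_RE.match(cleaned)
    return match.groupdict() if match else None


def errors_only(lines):
    """Keep only entries logged at error (E) or fatal (F) severity."""
    parsed = (parse_klog(line) for line in lines)
    return [entry for entry in parsed if entry and entry["sev"] in ("E", "F")]


# Shortened sample lines taken from the logs in this thread.
SAMPLE = [
    '│ I0612 22:01:54.068515       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" │',
    'E0612 22:29:52.564694       1 controller.go:329] "Reconciler error"',
]

for entry in errors_only(SAMPLE):
    print(entry["file"], entry["line"], entry["msg"])
```

Run against the logs posted so far, a filter like this would leave only the `Reconciler error` line from the capi-controller-manager pod; every CAPO line is logged at info severity.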

andresache commented 2 weeks ago

Hi @EmilienM,

Thank you for your response! I have taken note of your suggestion 👍

Here is the information:

cluster 'microk8s-openstack':

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ kubectl describe cluster microk8s-openstack
Name:         microk8s-openstack
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  cluster.x-k8s.io/v1beta1
Kind:         Cluster
Metadata:
  Creation Timestamp:  2024-06-13T13:02:55Z
  Finalizers:
    cluster.cluster.x-k8s.io
  Generation:        3
  Resource Version:  2835
  UID:               9ec0c899-40dd-4dbc-a287-197f2da46f9c
Spec:
  Control Plane Endpoint:
    Host:  10.8.8.225
    Port:  6443
  Control Plane Ref:
    API Version:  controlplane.cluster.x-k8s.io/v1beta1
    Kind:         MicroK8sControlPlane
    Name:         microk8s-openstack-control-plane
    Namespace:    default
  Infrastructure Ref:
    API Version:  infrastructure.cluster.x-k8s.io/v1beta1
    Kind:         OpenStackCluster
    Name:         microk8s-openstack
    Namespace:    default
Status:
  Conditions:
    Last Transition Time:  2024-06-13T13:03:53Z
    Reason:                WaitingForMicroK8sBoot
    Severity:              Info
    Status:                False
    Type:                  Ready
    Last Transition Time:  2024-06-13T13:02:56Z
    Message:               Waiting for control plane provider to indicate the control plane has been initialized
    Reason:                WaitingForControlPlaneProviderInitialized
    Severity:              Info
    Status:                False
    Type:                  ControlPlaneInitialized
    Last Transition Time:  2024-06-13T13:03:53Z
    Reason:                WaitingForMicroK8sBoot
    Severity:              Info
    Status:                False
    Type:                  ControlPlaneReady
    Last Transition Time:  2024-06-13T13:03:14Z
    Status:                True
    Type:                  InfrastructureReady
  Failure Domains:
    Nova:
      Control Plane:     true
  Infrastructure Ready:  true
  Observed Generation:   3
  Phase:                 Provisioned
Events:                  <none>

microk8s-openstack-control-plane:

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ kubectl describe machine microk8s-openstack-control-plane
Name:         microk8s-openstack-control-plane-n9ncb
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=microk8s-openstack
              cluster.x-k8s.io/control-plane=
Annotations:  <none>
API Version:  cluster.x-k8s.io/v1beta1
Kind:         Machine
Metadata:
  Creation Timestamp:  2024-06-13T13:03:14Z
  Finalizers:
    machine.cluster.x-k8s.io
  Generation:  4
  Owner References:
    API Version:           controlplane.cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  MicroK8sControlPlane
    Name:                  microk8s-openstack-control-plane
    UID:                   a33c2610-ddc1-429f-bf90-c749aaef7246
  Resource Version:        2862
  UID:                     0c69643f-edb7-48c9-b482-1f5f31dd7bd5
Spec:
  Bootstrap:
    Config Ref:
      API Version:     bootstrap.cluster.x-k8s.io/v1beta1
      Kind:            MicroK8sConfig
      Name:            microk8s-openstack-control-plane-mzlrc
      Namespace:       default
      UID:             51b826b0-52cc-4e9b-aa1e-269e6f247d86
    Data Secret Name:  microk8s-openstack-control-plane-mzlrc
  Cluster Name:        microk8s-openstack
  Failure Domain:      nova
  Infrastructure Ref:
    API Version:          infrastructure.cluster.x-k8s.io/v1beta1
    Kind:                 OpenStackMachine
    Name:                 microk8s-openstack-control-plane-lmq7p
    Namespace:            default
    UID:                  e2d0e7b2-c283-489c-bb38-a042b6299cfd
  Node Deletion Timeout:  10s
  Provider ID:            openstack:///ac42c664-8934-4bd5-bd3c-373d66ca4acd
  Version:                v1.25.0
Status:
  Addresses:
    Address:        192.168.99.218
    Type:           InternalIP
    Address:        10.8.8.225
    Type:           ExternalIP
    Address:        microk8s-openstack-control-plane-lmq7p
    Type:           InternalDNS
  Bootstrap Ready:  true
  Conditions:
    Last Transition Time:  2024-06-13T13:03:50Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2024-06-13T13:03:16Z
    Status:                True
    Type:                  BootstrapReady
    Last Transition Time:  2024-06-13T13:03:50Z
    Status:                True
    Type:                  InfrastructureReady
    Last Transition Time:  2024-06-13T13:03:15Z
    Reason:                WaitingForNodeRef
    Severity:              Info
    Status:                False
    Type:                  NodeHealthy
  Infrastructure Ready:    true
  Last Updated:            2024-06-13T13:03:54Z
  Observed Generation:     3
  Phase:                   Provisioned
Events:                    <none>

microk8s-openstack-md-0

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ kubectl describe machine microk8s-openstack-md-0
Name:         microk8s-openstack-md-0-559xz-qdzf9
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=microk8s-openstack
              cluster.x-k8s.io/deployment-name=microk8s-openstack-md-0
              cluster.x-k8s.io/set-name=microk8s-openstack-md-0-559xz
              machine-template-hash=2083806388-559xz
Annotations:  <none>
API Version:  cluster.x-k8s.io/v1beta1
Kind:         Machine
Metadata:
  Creation Timestamp:  2024-06-13T13:02:56Z
  Finalizers:
    machine.cluster.x-k8s.io
  Generation:  1
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  MachineSet
    Name:                  microk8s-openstack-md-0-559xz
    UID:                   f7bb6968-91f4-4480-bf1b-1d1358baeba6
  Resource Version:        2577
  UID:                     79a543dc-5fde-4292-8303-e7050c2329c5
Spec:
  Bootstrap:
    Config Ref:
      API Version:  bootstrap.cluster.x-k8s.io/v1beta1
      Kind:         MicroK8sConfig
      Name:         microk8s-openstack-md-0-559xz-qdzf9
      Namespace:    default
      UID:          8c189909-d16c-4b97-8893-7dd0b198e1da
  Cluster Name:     microk8s-openstack
  Failure Domain:   nova
  Infrastructure Ref:
    API Version:          infrastructure.cluster.x-k8s.io/v1beta1
    Kind:                 OpenStackMachine
    Name:                 microk8s-openstack-md-0-559xz-qdzf9
    Namespace:            default
    UID:                  40bddca2-5fb3-48cd-a71f-521645ed9e28
  Node Deletion Timeout:  10s
  Version:                v1.25.0
Status:
  Conditions:
    Last Transition Time:  2024-06-13T13:03:14Z
    Message:               0 of 2 completed
    Reason:                WaitingForBootstrapData
    Severity:              Info
    Status:                False
    Type:                  Ready
    Last Transition Time:  2024-06-13T13:03:14Z
    Reason:                WaitingForControlPlaneAvailable
    Severity:              Info
    Status:                False
    Type:                  BootstrapReady
    Last Transition Time:  2024-06-13T13:03:14Z
    Reason:                WaitingForBootstrapData
    Severity:              Info
    Status:                False
    Type:                  InfrastructureReady
    Last Transition Time:  2024-06-13T13:02:56Z
    Reason:                WaitingForNodeRef
    Severity:              Info
    Status:                False
    Type:                  NodeHealthy
  Last Updated:            2024-06-13T13:02:56Z
  Observed Generation:     1
  Phase:                   Pending
Events:                    <none>

And below is the manifest I'm applying in my local management cluster to create the cluster in openstack:

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ cat cluster-openstack.yaml 
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: microk8s-openstack
  namespace: default
spec:
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: MicroK8sControlPlane
    name: microk8s-openstack-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
    kind: OpenStackCluster
    name: microk8s-openstack
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackCluster
metadata:
  name: microk8s-openstack
  namespace: default
spec:
  apiServerLoadBalancer:
    enabled: false
  cloudName: openstack
  disablePortSecurity: true
  dnsNameservers:
  - 10.8.8.1
  - 193.226.5.151
  - 8.8.8.8
  externalNetworkId: ""
  identityRef:
    kind: Secret
    name: cloud-config
  nodeCidr: 192.168.99.0/24
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: MicroK8sControlPlane
metadata:
  name: microk8s-openstack-control-plane
  namespace: default
spec:
  controlPlaneConfig:
    clusterConfiguration:
      portCompatibilityRemap: true
    initConfiguration:
      addons:
      - dns
      - ingress
      confinement: classic
      httpProxy: ""
      httpsProxy: ""
      joinTokenTTLInSecs: 900000
      noProxy: ""
      riskLevel: stable
  machineTemplate:
    infrastructureTemplate:
      apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
      kind: OpenStackMachineTemplate
      name: microk8s-openstack-control-plane
  replicas: 1
  upgradeStrategy: SmartUpgrade
  version: v1.25.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
metadata:
  name: microk8s-openstack-control-plane
  namespace: default
spec:
  template:
    spec:
      cloudName: openstack
      flavor: kube_nodes
      identityRef:
        kind: Secret
        name: cloud-config
      image: ubuntu20
      sshKeyName: ssh-key
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: microk8s-openstack-md-0
  namespace: default
spec:
  clusterName: microk8s-openstack
  replicas: 1
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: MicroK8sConfigTemplate
          name: microk8s-openstack-md-0
      clusterName: microk8s-openstack
      failureDomain: nova
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
        kind: OpenStackMachineTemplate
        name: microk8s-openstack-md-0
      version: 1.25.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha7
kind: OpenStackMachineTemplate
metadata:
  name: microk8s-openstack-md-0
  namespace: default
spec:
  template:
    spec:
      cloudName: openstack
      flavor: kube_nodes
      identityRef:
        kind: Secret
        name: cloud-config
      image: ubuntu20
      sshKeyName: ssh-key
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: MicroK8sConfigTemplate
metadata:
  name: microk8s-openstack-md-0
  namespace: default
spec:
  template:
    spec:
      clusterConfiguration:
        portCompatibilityRemap: true
      initConfiguration:
        confinement: classic
        httpProxy: ""
        httpsProxy: ""
        noProxy: ""
        riskLevel: stable

Let me know if you need any other information.

Thanks!

EmilienM commented 2 weeks ago

Please share the describe of OpenStackCluster and the OpenStackMachines.

andresache commented 2 weeks ago

I think this is what you were expecting:

OpenStackCluster describe:

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ kubectl describe openstackcluster microk8s-openstack
Name:         microk8s-openstack-control-plane-lmq7p
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=microk8s-openstack
              cluster.x-k8s.io/control-plane=
Annotations:  cluster.x-k8s.io/cloned-from-groupkind: OpenStackMachineTemplate.infrastructure.cluster.x-k8s.io
              cluster.x-k8s.io/cloned-from-name: microk8s-openstack-control-plane
              cluster.x-k8s.io/conversion-data:
                {"spec":{"h":"Hp4mVmnMPrI=","d":{"cloudName":"openstack","flavor":"kube_nodes","image":"ubuntu20","sshKeyName":"ssh-key","identityRef":{"k...
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         OpenStackMachine
Metadata:
  Creation Timestamp:  2024-06-13T13:03:14Z
  Finalizers:
    openstackmachine.infrastructure.cluster.x-k8s.io
  Generation:  2
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Machine
    Name:                  microk8s-openstack-control-plane-n9ncb
    UID:                   0c69643f-edb7-48c9-b482-1f5f31dd7bd5
  Resource Version:        2838
  UID:                     e2d0e7b2-c283-489c-bb38-a042b6299cfd
Spec:
  Flavor:  kube_nodes
  Identity Ref:
    Cloud Name:  openstack
    Name:        cloud-config
  Image:
    Filter:
      Name:      ubuntu20
  Provider ID:   openstack:///ac42c664-8934-4bd5-bd3c-373d66ca4acd
  Ssh Key Name:  ssh-key
Status:
  Addresses:
    Address:  192.168.99.218
    Type:     InternalIP
    Address:  10.8.8.225
    Type:     ExternalIP
    Address:  microk8s-openstack-control-plane-lmq7p
    Type:     InternalDNS
  Conditions:
    Last Transition Time:  2024-06-13T13:03:50Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2024-06-13T13:03:50Z
    Status:                True
    Type:                  APIServerIngressReadyCondition
    Last Transition Time:  2024-06-13T13:03:50Z
    Status:                True
    Type:                  InstanceReady
  Instance ID:             ac42c664-8934-4bd5-bd3c-373d66ca4acd
  Instance State:          ACTIVE
  Ready:                   true
  Resolved:
    Image ID:  4defaaa9-bfa2-4f86-92dd-a83fc38e8101
    Ports:
      Description:  Created by cluster-api-provider-openstack cluster default-microk8s-openstack
      Fixed I Ps:
        Subnet:    a6600895-9ef9-4b94-9c0d-bed2ae23d1c1
      Name:        microk8s-openstack-control-plane-lmq7p-0
      Network ID:  7dd92989-2832-4037-b8c9-8848b9317f15
  Resources:
    Ports:
      Id:  615302bf-e2ac-44e7-ac84-58aacae1a588
Events:    <none>

Worker node describe (OpenStackMachine):

andres@andres-VirtualBox:~/cluster-api-bootstrap-provider-microk8s$ kubectl describe openstackmachine microk8s-openstack-md-0 -n default
Name:         microk8s-openstack-md-0-559xz-qdzf9
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=microk8s-openstack
              cluster.x-k8s.io/deployment-name=microk8s-openstack-md-0
              cluster.x-k8s.io/set-name=microk8s-openstack-md-0-559xz
              machine-template-hash=2083806388-559xz
Annotations:  cluster.x-k8s.io/cloned-from-groupkind: OpenStackMachineTemplate.infrastructure.cluster.x-k8s.io
              cluster.x-k8s.io/cloned-from-name: microk8s-openstack-md-0
API Version:  infrastructure.cluster.x-k8s.io/v1beta1
Kind:         OpenStackMachine
Metadata:
  Creation Timestamp:  2024-06-13T13:02:56Z
  Generation:          1
  Owner References:
    API Version:           cluster.x-k8s.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Machine
    Name:                  microk8s-openstack-md-0-559xz-qdzf9
    UID:                   79a543dc-5fde-4292-8303-e7050c2329c5
  Resource Version:        2564
  UID:                     40bddca2-5fb3-48cd-a71f-521645ed9e28
Spec:
  Flavor:  kube_nodes
  Identity Ref:
    Cloud Name:  openstack
    Name:        cloud-config
  Image:
    Filter:
      Name:      ubuntu20
  Ssh Key Name:  ssh-key
Status:
  Conditions:
    Last Transition Time:  2024-06-13T13:03:14Z
    Reason:                WaitingForBootstrapData
    Severity:              Info
    Status:                False
    Type:                  Ready
    Last Transition Time:  2024-06-13T13:03:14Z
    Reason:                WaitingForBootstrapData
    Severity:              Info
    Status:                False
    Type:                  InstanceReady
Events:                    <none>

Let me know if this is right. Thanks!

EmilienM commented 2 weeks ago

I don't see much Status in the OpenStackMachine. Please share the CAPO manager logs. I'm nearly convinced something's fishy on the microk8s side of things though.

andresache commented 2 weeks ago

Here are the logs of the CAPO manager:

capo-system/capo-controller-manager-7c468b6c46-txxhb

I0613 16:00:22.212292       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-zzflv" namespace="default" name="microk8s-openstack-control-plane-zzflv" reconcileID="8807afaf-5916-4905-9b88-6335514f1012" openStackMachine="microk8s-openstack-control-plane-zzflv" machine="microk8s-openstack-control-plane-mhgrm" cluster="microk8s-openstack" openStackCluster="microk8s-openstack"
I0613 16:00:22.299362       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-md-0-szf6n-rvc5z" namespace="default" name="microk8s-openstack-md-0-szf6n-rvc5z" reconcileID="7f0cd109-16cf-4896-b476-e70675b75483" openStackMachine="microk8s-openstack-md-0-szf6n-rvc5z" machine="microk8s-openstack-md-0-szf6n-rvc5z" cluster="microk8s-openstack" openStackCluster="microk8s-openstack"
I0613 16:00:23.128988       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-zzflv" namespace="default" name="microk8s-openstack-control-plane-zzflv" reconcileID="8807afaf-5916-4905-9b88-6335514f1012" openStackMachine="microk8s-openstack-control-plane-zzflv" machine="microk8s-openstack-control-plane-mhgrm" cluster="microk8s-openstack" openStackCluster="microk8s-openstack" id="36280124-5978-4771-b727-aac325540ee2"
I0613 16:00:23.129034       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-zzflv" namespace="default" name="microk8s-openstack-control-plane-zzflv" reconcileID="8807afaf-5916-4905-9b88-6335514f1012" openStackMachine="microk8s-openstack-control-plane-zzflv" machine="microk8s-openstack-control-plane-mhgrm" cluster="microk8s-openstack" openStackCluster="microk8s-openstack"
I0613 16:00:23.744634       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-zzflv" namespace="default" name="microk8s-openstack-control-plane-zzflv" reconcileID="8807afaf-5916-4905-9b88-6335514f1012" openStackMachine="microk8s-openstack-control-plane-zzflv" machine="microk8s-openstack-control-plane-mhgrm" cluster="microk8s-openstack" openStackCluster="microk8s-openstack" id="237f31f3-17a4-4c51-9a98-1360878aa62b" fixedIP="192.168.99.116" portID="5ed0f8e1-7514-488e-bbfd-1e545e6b95cf"
I0613 16:00:23.744746       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-zzflv" namespace="default" name="microk8s-openstack-control-plane-zzflv" reconcileID="8807afaf-5916-4905-9b88-6335514f1012" openStackMachine="microk8s-openstack-control-plane-zzflv" machine="microk8s-openstack-control-plane-mhgrm" cluster="microk8s-openstack" openStackCluster="microk8s-openstack"

And one more thing I've noticed in the capi microk8s manager logs: capi-microk8s-bootstrap-controller-manager-64cb5cb474-4bc6n

I0613 15:58:39.675202       1 microk8sconfig_controller.go:218] "Cluster control plane is not initialized, waiting" controller="microk8sconfig" controllerGroup="bootstrap.cluster.x-k8s.io" controllerKind="MicroK8sConfig" MicroK8sConfig="default/microk8s-openstack-md-0-szf6n-rvc5z" namespace="default" name="microk8s-openstack-md-0-szf6n-rvc5z" reconcileID=58cae8ae-35fd-47ce-9558-dd4dfbeec21b kind="Machine" version="19505" name="microk8s-openstack-md-0-szf6n-rvc5z

It states that the control plane is not yet initialized, even though the control plane node is up.
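One way to see what Cluster API itself believes about the control plane is to read the conditions on the Cluster object rather than the OpenStack instance state. The following is a minimal sketch, assuming the v1beta1 `Cluster` resource with its `ControlPlaneInitialized` condition and `status.controlPlaneReady` field. The function name and the sample object are hypothetical; real input would come from something like `kubectl get cluster microk8s-openstack -o json`.

```python
import json


def control_plane_summary(cluster: dict) -> dict:
    """Summarise control plane state from a Cluster API v1beta1 Cluster object."""
    status = cluster.get("status", {})
    # Index conditions by type so a missing condition is easy to detect.
    conditions = {c["type"]: c for c in status.get("conditions", [])}
    initialized = conditions.get("ControlPlaneInitialized", {})
    return {
        "initialized": initialized.get("status") == "True",
        "reason": initialized.get("reason", ""),
        "message": initialized.get("message", ""),
        "ready": bool(status.get("controlPlaneReady", False)),
    }


# Hypothetical sample, made up for illustration only (not taken from this cluster).
SAMPLE = {
    "kind": "Cluster",
    "metadata": {"name": "microk8s-openstack", "namespace": "default"},
    "status": {
        "infrastructureReady": True,
        "controlPlaneReady": False,
        "conditions": [
            {
                "type": "ControlPlaneInitialized",
                "status": "False",
                "reason": "WaitingForControlPlaneProviderInitialized",
                "message": "Waiting for control plane provider to indicate the control plane has been initialized",
            }
        ],
    },
}

print(json.dumps(control_plane_summary(SAMPLE), indent=2))
```

If `initialized` stays false while the VM is ACTIVE, the problem is more likely in the bootstrap of the node (cloud-init, MicroK8s startup) than in the instance creation itself.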

andresache commented 2 weeks ago

What I'm actually trying to do is set up autoscaling for my cluster in OpenStack.

There is no support for OpenStack Horizon in the autoscaler project: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider, so I was thinking of using Cluster API to achieve this: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/clusterapi.

I'm not sure if I'm going in the right direction or if there is a simpler way to achieve this. Any guidance or advice would be much appreciated.
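For context, the clusterapi provider of cluster-autoscaler picks up node groups through min/max size annotations on a MachineDeployment (or MachineSet). Here is a minimal sketch of building such a patch, assuming the annotation keys described in the clusterapi provider README. The helper name `autoscaler_patch` and the sizes are hypothetical.

```python
import json

# Annotation keys as documented by the cluster-autoscaler clusterapi provider
# (assumption: unchanged in the version you deploy).
MIN_KEY = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size"
MAX_KEY = "cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size"


def autoscaler_patch(min_size: int, max_size: int) -> str:
    """Return a JSON merge patch that sets the autoscaler node group bounds."""
    if min_size < 0 or max_size < min_size:
        raise ValueError("need 0 <= min_size <= max_size")
    patch = {
        "metadata": {
            "annotations": {
                # Annotation values must be strings.
                MIN_KEY: str(min_size),
                MAX_KEY: str(max_size),
            }
        }
    }
    return json.dumps(patch)


print(autoscaler_patch(1, 3))
```

The printed JSON could then be applied with `kubectl patch machinedeployment <name> --type merge -p '<patch>'`, where `<name>` is the worker MachineDeployment in the management cluster.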

andresache commented 2 weeks ago

I also tried increasing the number of control plane nodes and worker nodes to 2, and it looks like OpenStack is able to provision only one instance, so the issue might not be specific to the worker node.

image

And these are the logs of the CAPO manager:

I0613 19:26:38.899729       1 floatingip.go:172] "Associating floating IP" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-5hwbj" n │
│ I0613 19:26:41.869361       1 floatingip.go:230] "Waiting for floating IP" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-plane-5hwbj" n │
│ I0613 19:26:45.580397       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:26:45.585086       1 recorder.go:104] "Associated floating IP 10.8.8.229 with port 23df85c9-1078-4137-93f1-3c7e07375590" logger="events" type="Normal" object={"kind":"OpenStackMachine","namespace":"default","name":"microk8s-openstack-control-plane │
│ I0613 19:26:45.703524       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:26:46.066160       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:26:46.066239       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:26:46.782624       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:26:46.782655       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:26:46.816587       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:26:47.191300       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:26:47.191388       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:26:47.703798       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:26:47.703883       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:26:47.722832       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:26:48.018272       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:26:48.018297       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:26:48.627508       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:26:48.627607       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:26:48.633401       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:26:49.028271       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:26:49.028460       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:26:49.621532       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:26:49.621561       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:26:50.328682       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:26:50.556496       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:26:50.556520       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:26:51.596710       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:26:51.596802       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │
│ I0613 19:30:23.571335       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="def │
│ I0613 19:30:23.573423       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="def │
│ I0613 19:30:23.573891       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="def │
│ I0613 19:30:23.573950       1 openstackmachine_controller.go:581] "Reconciling Machine" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openstack-control-p │
│ I0613 19:30:23.991123       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-openst │
│ I0613 19:30:23.991149       1 openstackmachine_controller.go:694] "Reconciling APIServerLoadBalancer" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s-opens │
│ I0613 19:30:24.706063       1 openstackmachine_controller.go:731] "Floating IP already associated to a port" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8 │
│ I0613 19:30:24.706237       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="OpenStackMachine" OpenStackMachine="default/microk8s- │

Could it be an issue on the openstack side?
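Logs like the ones above are easier to read once the repeated reconcile messages are counted. The following is a rough sketch, assuming the standard klog header layout (severity letter, `mmdd`, time, thread id, `file:line]`, then a quoted message). The sample lines are shortened copies of lines from the output above, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# klog header: I0613 19:26:45.580397       1 file.go:689] "message" key="value" ...
KLOG = re.compile(
    r'([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ ([\w.]+:\d+)\] "([^"]*)"'
)


def summarize(lines):
    """Count log entries by (severity, source location, message)."""
    counts = Counter()
    for line in lines:
        match = KLOG.search(line)  # search() tolerates the leading box-drawing character
        if match:
            severity, _date, _time, location, message = match.groups()
            counts[(severity, location, message)] += 1
    return counts


# Shortened copies of lines from the CAPO manager output above.
SAMPLE = [
    '│ I0613 19:26:46.066160       1 openstackmachine_controller.go:644] "Machine instance state is ACTIVE" controller="openstackmachine"',
    '│ I0613 19:26:46.782655       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine"',
    '│ I0613 19:26:47.703883       1 openstackmachine_controller.go:689] "Reconciled Machine create successfully" controller="openstackmachine"',
    '│ I0613 19:30:23.571335       1 openstackmachine_controller.go:550] "Bootstrap data secret reference is not yet available" controller="openstackmachine"',
    '│ ',
]

for (severity, location, message), n in summarize(SAMPLE).most_common():
    print(f"{n:3d} {severity} {location} {message}")
```

In the output above, every entry is at info level and the only entries that are not part of a successful reconcile are the "Bootstrap data secret reference is not yet available" lines, which point at the bootstrap provider rather than at OpenStack.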

jichenjc commented 2 weeks ago

> There is no support for openstack horizon in the autoscaler project: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider, therefore I was thinking to use the cluster api to achieve this https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/clusterapi.

I tried this about a year ago, and at least at that time the autoscaler worked fine for me on CAPI + CAPO.

As for this issue, it looks to me like something is wrong with the control plane, so it's not ready yet. Are you able to SSH into the deployed VM and run any k8s commands there?