kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0

Loadbalancer for ig bastion is not created on openstack #16867


networkhell commented 2 weeks ago

/kind bug

1. What kops version are you running?

kops version
Client version: 1.30.1

2. What Kubernetes version are you running?

kubectl version
Client Version: v1.31.1
Kustomize Version: v5.4.2
Server Version: v1.30.5

3. What cloud provider are you using? OpenStack

4. What commands did you run? What is the simplest way to reproduce this issue?

kops create -f kops-test-fh.k8s.local.yaml --state swift://kops
kops create secret --name kops-fc-fh.k8s.local sshpublickey admin -i ~/.ssh/id_kops.pub --state swift://kops
kops update cluster --name kops-fc-fh.k8s.local --yes --state swift://kops

5. What happened after the commands executed? The cluster and bastion hosts are created as expected, but no load balancer is created for the bastion hosts, rendering them unusable.
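
A quick way to verify this, assuming the openstack CLI with the Octavia (load balancer) plugin is installed; the --name filter below is illustrative:

# Expected result: only the API load balancer appears, nothing for the bastions.
openstack loadbalancer list

# The bastion instances themselves do exist.
openstack server list --name bastions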

6. What did you expect to happen? Given that the API spec looks like this:

spec:
  topology:
    bastion:
      loadBalancer:
        type: Public
...
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: bastions
spec:
  associatePublicIp: false
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 3
  minSize: 3
  role: Bastion
  subnets:
  - muc5-a
  - muc5-b
  - muc5-d

I expect kops to create a load balancer with a floating IP for this instance group. Instead, no load balancer is created for the bastion hosts.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: null
  name: kops-fc-fh.k8s.local
spec:
  addons:
  - manifest: swift://kops-addons/addon.yaml
  api:
    loadBalancer:
      type: Internal
      useForInternalApi: true
  authorization:
    rbac: {}
  channel: stable
  cloudConfig:
    openstack:
      blockStorage:
        bs-version: v3
        clusterName: kops-fc-fh.k8s.local
        createStorageClass: false
        csiTopologySupport: true
        ignore-volume-az: false
      loadbalancer:
        floatingNetwork: external
        floatingNetworkID: 68a806ee-4eb8-4b50-ae49-c06bde9baf06
        method: ROUND_ROBIN
        provider: amphora
        useOctavia: true
      monitor:
        delay: 15s
        maxRetries: 3
        timeout: 10s
      router:
        externalNetwork: external
  cloudControllerManager:
    clusterName: kops-fc-fh.k8s.local
  cloudProvider: openstack
  configBase: swift://kops/kops-fc-fh.k8s.local
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: control-plane-muc5-a
      name: a
      volumeType: rbd_fast
    - instanceGroup: control-plane-muc5-b
      name: b
      volumeType: rbd_fast
    - instanceGroup: control-plane-muc5-d
      name: c
      volumeType: rbd_fast
    manager:
      backupInterval: 24h0m0s
      backupRetentionDays: 90
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:8081
      - name: ETCD_METRICS
        value: basic
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: control-plane-muc5-a
      name: a
      volumeType: rbd_fast
    - instanceGroup: control-plane-muc5-b
      name: b
      volumeType: rbd_fast
    - instanceGroup: control-plane-muc5-d
      name: c
      volumeType: rbd_fast
    manager:
      backupRetentionDays: 90
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    anonymousAuth: false
    tlsCipherSuites:
    - TLS_AES_128_GCM_SHA256
    - TLS_AES_256_GCM_SHA384
    - TLS_CHACHA20_POLY1305_SHA256
    - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
    - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
    - TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256
    - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
    - TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
    tlsMinVersion: VersionTLS13
  kubeDNS:
    nodeLocalDNS:
      cpuRequest: 25m
      enabled: true
      memoryRequest: 5Mi
    provider: CoreDNS
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  - ::/0
  kubernetesVersion: 1.30.2
  metricsServer:
    enabled: true
    insecure: true
  networkCIDR: 10.42.0.0/24
  networking:
    calico: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  - ::/0
  subnets:
  - cidr: 10.42.0.64/26
    name: muc5-a
    type: Private
    zone: muc5-a
  - cidr: 10.42.0.128/26
    name: muc5-b
    type: Private
    zone: muc5-b
  - cidr: 10.42.0.192/26
    name: muc5-d
    type: Private
    zone: muc5-d
  - cidr: 10.42.0.0/29
    name: utility-muc5-a
    type: Utility
    zone: muc5-a
  - cidr: 10.42.0.8/29
    name: utility-muc5-b
    type: Utility
    zone: muc5-b
  - cidr: 10.42.0.16/29
    name: utility-muc5-d
    type: Utility
    zone: muc5-d
  topology:
    bastion:
      loadBalancer:
        type: Public
    dns:
      type: None

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:24Z"
  generation: 1
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: bastions
spec:
  associatePublicIp: false
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 3
  minSize: 3
  role: Bastion
  subnets:
  - muc5-a
  - muc5-b
  - muc5-d

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:22Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: control-plane-muc5-a
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - muc5-a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:22Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: control-plane-muc5-b
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - muc5-b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:22Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: control-plane-muc5-d
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Master
  subnets:
  - muc5-d

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:23Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: nodes-muc5-a
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - muc5-a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:23Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: nodes-muc5-b
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - muc5-b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2024-09-27T11:26:24Z"
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: nodes-muc5-d
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Node
  subnets:
  - muc5-d

8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.

I will provide the logs if they are necessary for troubleshooting, but it will take some time to redact the output.
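
For reference, the update command from step 4 with verbose logging would be:

kops update cluster --name kops-fc-fh.k8s.local --yes --state swift://kops -v 10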

9. Anything else we need to know?

zetaab commented 1 day ago

What is the use case for running 3 bastions? A bastion is just a jump host for the control plane (and nodes), and in a normal situation it is not used at all. People should not log in to machines via SSH if everything is working as it should. The purpose of a bastion host is to provide debugging capability when something is not working.

networkhell commented 5 hours ago

@zetaab I am testing a multi-AZ cluster on OpenStack, so it is handy to have one bastion per availability zone for debugging and testing. I opened this issue because the kops documentation explicitly mentions this configuration in the bastion host section - https://kops.sigs.k8s.io/bastion/

spec:
  topology:
    bastion:
      bastionPublicName: bastion.mycluster.example.com
      loadBalancer:
        type: Public

But on OpenStack, neither a load balancer is created nor a DNS record set up for the bastion hosts.
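
A manual workaround is to attach floating IPs to the bastions directly (a sketch using the standard openstack CLI; <bastion-server-name> and <floating-ip> are placeholders, and "external" is the floating network from the cluster spec above):

# Allocate a floating IP from the external network.
openstack floating ip create external

# Attach it to one of the bastion instances.
openstack server add floating ip <bastion-server-name> <floating-ip>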

zetaab commented 5 hours ago

kOps does support multiple providers, but there are differences between them.

At least we are still using the old OVS-based setup, which means that a load balancer is, under the hood, 2 virtual machines running keepalived with haproxy. It is just a waste of resources to put a load balancer in front of a bastion. If you want one bastion per AZ, a single instance group in kops does not guarantee that. It is also pretty difficult to debug with a load balancer in front of these, as you do not know which bastion it is going to connect to.
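
One instance group per zone would guarantee the placement, e.g. (a sketch following the pattern of the control-plane groups above; the name bastion-muc5-a is illustrative):

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: kops-fc-fh.k8s.local
  name: bastion-muc5-a
spec:
  image: flatcar
  machineType: SCS-2V-8-20s
  maxSize: 1
  minSize: 1
  role: Bastion
  subnets:
  - muc5-a

and likewise bastion-muc5-b and bastion-muc5-d.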

networkhell commented 4 hours ago

I agree that this setup does not really make sense for production clusters, and that it may make no sense to put load balancers in front of bastion hosts. But at the very least, the documentation should make clear that these features are not available on OpenStack. And, though perhaps not 100% in the scope of this issue, setting a bastionPublicName would be a useful feature, which currently does not seem to work on OpenStack.