kubernetes / cloud-provider-openstack

Apache License 2.0

[occm] The provisioning duration of LoadBalancer type services increases significantly with the number of cluster nodes. #1595

Closed jelinek-wgs closed 3 years ago

jelinek-wgs commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug /kind feature

What happened:

Initially, provisioning a new LoadBalancer type service took ~5 min; provisioning a single service now takes ~15 min.

What you expected to happen:

We would expect the load balancer to be reported to the cluster as ready as soon as the load balancer instance is provisioned, a floating IP is assigned, and at least 1 member (or 2-3 for HA) has been added. The remaining members (K8s nodes) would then be added in the "background".
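A minimal sketch of that expectation, not OCCM code: the function name `provision_members` and the callbacks `add_member` and `on_ready` are hypothetical stand-ins for "add one node to the pool" and "report the service as ready". It only illustrates the ordering we have in mind: a few members first, then the ready signal, then the rest in the background.

```python
import threading
from typing import Callable, Sequence


def provision_members(
    nodes: Sequence[str],
    add_member: Callable[[str], None],  # hypothetical: adds one node to the pool
    on_ready: Callable[[], None],       # hypothetical: marks the service as ready
    min_ready: int = 1,
) -> threading.Thread:
    """Add `min_ready` members, signal readiness, then add the rest in the background.

    Returns the background thread so the caller can join it if needed.
    """
    first = list(nodes[:min_ready])
    rest = list(nodes[min_ready:])

    # Only these members block the "ready" signal.
    for node in first:
        add_member(node)
    on_ready()

    def add_remaining() -> None:
        for node in rest:
            add_member(node)

    worker = threading.Thread(target=add_remaining, daemon=True)
    worker.start()
    return worker
```

With `min_ready=2` or `3`, the service would be reported as ready once an HA-capable subset of nodes is in the pool, independent of the total node count.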

How to reproduce it:

Provision a cluster with <= 12 worker nodes and have OCCM provision a load balancer through Octavia by creating a LoadBalancer type service. Then scale the cluster to roughly 50 <= node count <= 60 and create a LoadBalancer type service again.

Anything else we need to know?:

We observed that OCCM adds the members to the OpenStack load balancer serially, and the service is reported as available in the Kubernetes cluster only after the last node has been added. As a result, the provisioning time of a LoadBalancer service grows with the number of K8s nodes.

In one case a user specified 2 ports on the service, so OCCM iterated over the list of nodes twice, serially, and the provisioning time of that instance increased to 32 min. In our case this means roughly 15 min of additional provisioning time per additional port.
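To make the scaling explicit, here is a rough back-of-the-envelope model. It assumes one pool per service port and one member per node in each pool. The function names, the `batched` variant (one call per pool), and the `seconds_per_call` value are illustrative assumptions, not measurements and not a description of how OCCM is implemented.

```python
def member_api_calls(nodes: int, ports: int, batched: bool = False) -> int:
    """Count the member-related calls needed to fill every pool.

    Assumes one pool per port and one member per node in each pool.
    """
    if nodes < 0 or ports < 0:
        raise ValueError("nodes and ports must be non-negative")
    if batched:
        # Hypothetical alternative: one call per pool, regardless of node count.
        return ports
    # Serial behavior as observed: one call per node, repeated for every port.
    return nodes * ports


def estimated_minutes(nodes: int, ports: int, seconds_per_call: float) -> float:
    """Estimate serial provisioning time from an assumed per-call latency."""
    return member_api_calls(nodes, ports) * seconds_per_call / 60.0


# Illustrative only: 15 s per call is an assumed value, not a measured one.
print(estimated_minutes(nodes=60, ports=1, seconds_per_call=15.0))
print(estimated_minutes(nodes=60, ports=2, seconds_per_call=15.0))
```

Under these assumptions the serial cost is proportional to nodes x ports, which is in line with the roughly doubled time we saw for a 2-port service.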

Environment:

lingxiankong commented 3 years ago

Hi, thanks for reporting this issue. That behavior has actually changed since release-1.19. Have you tried e.g. k8scloudprovider/openstack-cloud-controller-manager:v1.19.2?

jelinek-wgs commented 3 years ago

Hi @lingxiankong, thank you! It worked perfectly fine. The provisioning time has been reduced to 4 min.