kubernetes / cloud-provider-openstack

[occm] Error deleting loadbalancer since release-1.20 #1636

Closed eumel8 closed 3 years ago

eumel8 commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

Since the 1.20 release, we have had problems deleting a LoadBalancer service:

I0901 07:32:18.699090 1 event.go:291] "Event occurred" object="ingress-nginx/otc-lb" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to delete load balancer: error deleting obsolete pool 63b290c7-3ba1-4484-a514-21816f387859 for listener 4507fdc9-c24d-45be-a603-4d3ca375e82f: Expected HTTP response code [] when accessing [DELETE https://vpc.eu-de.otc.t-systems.com/v2.0/lbaas/pools/63b290c7-3ba1-4484-a514-21816f387859], but got 409 instead\n{\"NeutronError\": {\"detail\": \"\", \"message\": \"member c6458b52-cef8-404f-bfc6-052d83c06656 is using this pool\", \"type\": \"EntityInUse\"}}"

What you expected to happen:

I0901 08:30:45.002725       1 event.go:291] "Event occurred" object="ingress-nginx/otc-lb" kind="Service" apiVersion="v1" type="Normal" reason="DeletedLoadBalancer" message="Deleted load balancer"

How to reproduce it:

This error appeared between release-1.19 and release-1.20. After comparing the commits, I spotted https://github.com/kubernetes/cloud-provider-openstack/pull/1252. After reverting this commit, the load balancer is deleted as expected, and very quickly. I am not sure how this could be tuned further.

Anything else we need to know?:

@sbueringer FYI

Environment:

jichenjc commented 3 years ago

@eumel8 have you had a chance to test the latest 1.22 or master? Does the problem still occur there because of #1252?

eumel8 commented 3 years ago

@jichenjc unfortunately, the same error:

E0906 08:12:18.182017       1 controller.go:307] error processing service ingress-nginx/otc-lb (will retry): failed to delete load balancer: error deleting obsolete pool 3316574c-6bf6-4973-acb8-724af0f7db1d for listener 3f0552a3-d171-4603-9b89-6f2403f70b6f: Expected HTTP response code [202 204] when accessing [DELETE https://vpc.eu-de.otc.t-systems.com/v2.0/lbaas/pools/3316574c-6bf6-4973-acb8-724af0f7db1d], but got 409 instead
{"NeutronError": {"detail": "", "message": "member 91a6b67e-81da-4839-ba1b-511724a3cc00 is using this pool", "type": "EntityInUse"}}
I0906 08:12:18.182079       1 event.go:291] "Event occurred" object="ingress-nginx/otc-lb" kind="Service" apiVersion="v1" type="Warning" reason="SyncLoadBalancerFailed" message="Error syncing load balancer: failed to delete load balancer: error deleting obsolete pool 3316574c-6bf6-4973-acb8-724af0f7db1d for listener 3f0552a3-d171-4603-9b89-6f2403f70b6f: Expected HTTP response code [202 204] when accessing [DELETE https://vpc.eu-de.otc.t-systems.com/v2.0/lbaas/pools/3316574c-6bf6-4973-acb8-724af0f7db1d], but got 409 instead\n{\"NeutronError\": {\"detail\": \"\", \"message\": \"member 91a6b67e-81da-4839-ba1b-511724a3cc00 is using this pool\", \"type\": \"EntityInUse\"}}"

jichenjc commented 3 years ago

OK, I searched the CPO code and didn't find it, but I did find it in CAPI: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/master/controllers/openstackcluster_controller.go#L148 So I assume you are using CAPI + CPO?

And @lingxiankong, do we have any LB tests in e2e? If so, I'm curious why we didn't catch this error; maybe it's due to a different environment? Since #1252 is an optimization, maybe we can revert it and in the meantime dig into what's wrong in @eumel8's environment, then provide another fix.

lingxiankong commented 3 years ago

@eumel8 are you using Octavia or Neutron-LBaaS?

lingxiankong commented 3 years ago

> And @lingxiankong, do we have any LB tests in e2e? If so, I'm curious why we didn't catch this error; maybe it's due to a different environment? Since #1252 is an optimization, maybe we can revert it and in the meantime dig into what's wrong in @eumel8's environment, then provide another fix.

Let's grab more info before deciding.

eumel8 commented 3 years ago

@lingxiankong cloud.conf runs with use-octavia = false, which leaves Neutron-LBaaS as the only option.

lingxiankong commented 3 years ago

Unfortunately, we stopped maintaining Neutron-LBaaS as of release 1.17 (or 1.18?).