k8snetworkplumbingwg / whereabouts

A CNI IPAM plugin that assigns IP addresses cluster-wide
Apache License 2.0

[BUG] Failed to create the reconcile looper: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded #389

Closed. pallavi-mandole closed this issue 1 day ago.

pallavi-mandole commented 7 months ago

Describe the bug
A reconciler failure was reported when we tried to scale pods in/out. The reconciler job is scheduled every 5 minutes but fails with the following errors:

[error] failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
[error] failed to create the reconcile looper: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
[verbose] reconciler failure: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded

Current Behavior
We scale the deployment to 500 pod replicas, and the same number of IPAM pod references are created. When we scale back in to 1 replica, the pods are removed successfully but 130 pod references are left behind. I did two rounds of scale in/out followed by a release uninstall and redeployment, and hit the same issue every time: 130 pod references are left undeleted after a scale in.

To Reproduce
Steps to reproduce the behavior:

  1. Scale the deployment to 500 pod replicas; the same number of IPAM pod references are created.
  2. Scale the replicas back in to 1; the pods are removed successfully but 130 pod references are left.
  3. Repeat two rounds of scale in/out, then uninstall and redeploy the release; the same issue occurs every time: 130 pod references are left undeleted after a scale in.

Environment:

Additional info / context

adilGhaffarDev commented 7 months ago

@dougbtv kindly check this issue.

smoshiur1237 commented 7 months ago

Got a response from @dougbtv, who suggested disabling overlapping IP ranges. We are checking whether this issue can be fixed by disabling it. https://github.com/k8snetworkplumbingwg/whereabouts/tree/master#overlapping-ranges

smoshiur1237 commented 6 months ago

@dougbtv @andreaskaris We are using the overlapping IP ranges feature with the k8s storage backend, so disabling it will not serve our goal. I think this is a genuine bug that needs your attention and help to overcome.

smoshiur1237 commented 6 months ago

I would like to explain the issue in steps for a better view:

  1. IPs are stored in whereabouts' ippools CRD. There is also an overlapping IP ranges CRD used to store all IPs, which creates one object per IP.
  2. The issue occurs when the reconciler code tries to fetch all objects under the overlappingipranges CRD. As the number of IPs grows, the number of overlappingipranges objects grows (one per IP).
  3. On initialization of the first executor object, the reconciler tries to list all objects under the overlappingipranges CRD and hits a context deadline.

Link to the code where we get the error: code
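
To illustrate the scale problem, here is a minimal Go sketch of the kind of cluster-wide List the reconciler performs. The group/version/resource used below (whereabouts.cni.cncf.io/v1alpha1, overlappingrangeipreservations) is an assumption based on the CRD names mentioned above, not a reference to the actual reconciler code:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	dynClient, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// One reservation object exists per allocated IP, so this resource can
	// hold tens of thousands of objects in large clusters.
	gvr := schema.GroupVersionResource{
		Group:    "whereabouts.cni.cncf.io", // assumed group/version for the CRD
		Version:  "v1alpha1",
		Resource: "overlappingrangeipreservations",
	}

	// A 30-second budget, mirroring the deadline reported in the logs.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// A single unpaginated List returns every reservation in the cluster;
	// its size and latency grow linearly with the number of pod IPs.
	list, err := dynClient.Resource(gvr).Namespace(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		panic(err) // with enough objects, this is where "context deadline exceeded" shows up
	}
	fmt.Printf("cluster-wide reservations: %d\n", len(list.Items))
}
```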

adilGhaffarDev commented 3 months ago

If it's related to a timeout, we can fix it in two ways:

smoshiur1237 commented 1 month ago

/cc @manuelbuil Hi, we have been facing this issue for a long time and couldn't find a fix or workaround. Would you please take a look at it?

marseel commented 1 month ago

Hi all, I am coming from k8s sig-scalability to help you with fixing this issue.

The error client rate limiter Wait returned an error: context deadline exceeded indicates that many requests were issued at the same time and were throttled on the client side (nothing to do with the k8s control plane itself).

This particular PR, https://github.com/k8snetworkplumbingwg/whereabouts/pull/438, won't really help; it will probably even make things worse, as you will be issuing more calls.

So here is how it works:

The timeout that you are now setting in the context is not just the request timeout, but a budget for the sum of "waiting in the queue" and the request itself. If you want a timeout for the request only, you can specify it by setting TimeoutSeconds in metav1.ListOptions instead. For List operations, I would recommend setting the context timeout mentioned above to 1 minute, as the official SLO for List requests is 30s for 99% of requests: https://github.com/kubernetes/community/blob/master/sig-scalability/slos/api_call_latency.md#definition (this SLO covers only the request, not waiting in the queue).
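
A minimal client-go sketch of this distinction, with illustrative values (not whereabouts' actual code):

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listWithRequestTimeout contrasts the two timeouts discussed above:
// the context deadline covers waiting in the client-side rate-limiter queue
// *plus* the HTTP request, while ListOptions.TimeoutSeconds bounds only the
// server-side List request itself.
func listWithRequestTimeout(clientset kubernetes.Interface) error {
	// Generous overall budget: queueing + request.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()

	// Request-only timeout of 60s, following the suggestion above
	// (the List SLO is ~30s for 99% of requests, so leave headroom).
	requestTimeout := int64(60)

	pods, err := clientset.CoreV1().Pods(metav1.NamespaceAll).List(ctx, metav1.ListOptions{
		TimeoutSeconds: &requestTimeout,
	})
	if err != nil {
		return err
	}
	fmt.Printf("listed %d pods\n", len(pods.Items))
	return nil
}
```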

At a high level, I would recommend not using List at all and using an informer instead.
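
A sketch of that informer approach with the dynamic client; the reservation CRD's group/version/resource is again an assumption:

```go
package main

import (
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// startReservationInformer lists the reservations once, then keeps a local
// cache up to date via watch events, so a periodic reconciler can read from
// the cache instead of re-listing thousands of objects on every run.
func startReservationInformer(dynClient dynamic.Interface, stopCh <-chan struct{}) (cache.SharedIndexInformer, error) {
	// Assumed group/version/resource for the whereabouts reservation CRD.
	gvr := schema.GroupVersionResource{
		Group:    "whereabouts.cni.cncf.io",
		Version:  "v1alpha1",
		Resource: "overlappingrangeipreservations",
	}

	factory := dynamicinformer.NewFilteredDynamicSharedInformerFactory(
		dynClient, 10*time.Minute, metav1.NamespaceAll, nil)
	informer := factory.ForResource(gvr).Informer()

	factory.Start(stopCh)
	// Block until the initial List+Watch has populated the local cache.
	if !cache.WaitForCacheSync(stopCh, informer.HasSynced) {
		return nil, fmt.Errorf("timed out waiting for reservation cache to sync")
	}
	return informer, nil
}
```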

Hope this helps :crossed_fingers:

mlguerrero12 commented 1 month ago

Hi @smoshiur1237, I'll work on this.

smoshiur1237 commented 1 month ago

@mlguerrero12 Thanks, I proposed a fix that increases the RequestTimeout. Would you please take a look? It should fix the issue. PR

mlguerrero12 commented 1 month ago

It might fix the issue for 500 pods, but as I mentioned in your PR, we have a customer reporting this issue with 100 nodes and 30k pods. I'll explore other options and let you know.

mlguerrero12 commented 1 month ago

@smoshiur1237, I don't believe this issue can be solved by increasing the request timeout. Also, the reconciler job doesn't make a batch of requests before listing the cluster-wide reservations. What it does is list the pods and IP pools.

The root of the issue is that this reconciler job is expected to finish in 30 seconds. A context is created with this duration and used as the parent for the contexts of all requests. So, in large clusters, this parent context has expired by the time the job gets to listing the cluster-wide reservations. If you check the logic for listing pods, it uses the same duration that was set for the parent (30s).

What I'm going to do is remove this parent context and use 30 seconds for each listing operation (supported by the SLO @marseel mentioned above). All other types of requests will continue using the RequestTimeout value (10s). I will also use pagination for listing pods and cluster-wide reservations.
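
For reference, a minimal sketch of what paginated listing could look like with client-go; the 30s per-request timeout and 500-item page size are illustrative values, not the final implementation:

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listPodsPaginated fetches pods in fixed-size chunks via Limit/Continue so
// no single request has to return tens of thousands of objects at once.
func listPodsPaginated(clientset kubernetes.Interface) ([]corev1.Pod, error) {
	var pods []corev1.Pod
	opts := metav1.ListOptions{Limit: 500}

	for {
		// Each page gets its own context; there is no long-lived parent
		// context that could expire halfway through the reconcile run.
		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
		page, err := clientset.CoreV1().Pods(metav1.NamespaceAll).List(ctx, opts)
		cancel()
		if err != nil {
			return nil, fmt.Errorf("listing pods: %w", err)
		}

		pods = append(pods, page.Items...)
		if page.Continue == "" {
			return pods, nil
		}
		opts.Continue = page.Continue
	}
}
```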

mlguerrero12 commented 2 weeks ago

@smoshiur1237, @adilGhaffarDev, could you please share complete logs of the reconciler when this issue happens?

smoshiur1237 commented 2 weeks ago

@mlguerrero12 here are the original logs from a running whereabouts pod when we first hit this issue; I have trimmed repeated instances of the same messages to keep it readable:

2023-10-27T11:50:38Z [debug] the IP reservation: IP: x:y::1:2c is reserved for pod: bat-t1/cnf-complex-t1-2-stor-rwo-0
2023-10-27T11:50:38Z [debug] pod reference bat-t1/cnf-complex-t1-2-stor-rwo-0 matches allocation; Allocation IP: x:y::1:2c; PodIPs: map[x.y.1.44:{} x:y::1:2c:{}]
2023-10-27T11:50:38Z [error] failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
2023-10-27T11:50:38Z [error] failed to create the reconcile looper: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
2023-10-27T11:50:38Z [verbose] reconciler failure: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
2023-10-27T11:55:00Z [verbose] starting reconciler run
2023-10-27T11:55:00Z [debug] NewReconcileLooper - inferred connection data
2023-10-27T11:55:00Z [debug] listing IP pools
2023-10-27T11:55:37Z [debug] Added IP x.y.1.130 for pod bat-t1/cnf-complex-t1-1-dpdk-0
2023-10-27T11:55:37Z [debug] Added IP x.y.1.112 for pod bat-t1/cnf-complex-t1-1-dpdk-0
2023-10-27T11:55:37Z [debug] the IP reservation: IP: x:y::1:25 is reserved for pod: bat-t1/cnf-complex-t1-1-stor-rwo-1
2023-10-27T11:55:37Z [debug] pod reference bat-t1/cnf-complex-t1-1-stor-rwo-1 matches allocation; Allocation IP: x:y::1:25; PodIPs: map[x.y.1.37:{} x:y::1:25:{}]
2023-10-27T11:55:37Z [error] failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
2023-10-27T11:55:37Z [error] failed to create the reconcile looper: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded
2023-10-27T11:55:37Z [verbose] reconciler failure: failed to list all OverLappingIPs: client rate limiter Wait returned an error: context deadline exceeded

smoshiur1237 commented 1 week ago

Opened a new issue to track pod reference problem: https://github.com/k8snetworkplumbingwg/whereabouts/issues/483