@MageekChiu thanks for raising this.
watcherSyncer will wait for 1001 updates and call "ws.callbacks.OnUpdates(updates)" to invoke the onUpdate above
I assume you're looking at this code? https://github.com/projectcalico/calico/blob/77ecdcda90c0a8a38ae247067169523484c6c8e3/libcalico-go/lib/backend/watchersyncer/watchersyncer.go#L156-L166
If so, that only waits for a batch of 1000 updates if we are receiving a rapid fire of updates - if the loop ever executes and there is not a pending update to process, we will send what we have rather than waiting. I don't think this should be a problem, and this logic is used extensively throughout Calico so I am pretty confident it's not waiting here.
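Roughly, the batching works like a drain-then-flush loop along these lines (just an illustrative sketch, not the actual watchersyncer code; the update type, channel, and maxBatch constant are assumptions):

type update struct{ key string }

const maxBatch = 1000

// batchLoop blocks for the first pending update, then drains whatever else is
// immediately available (up to maxBatch) and flushes. If nothing else is
// pending it sends what it has rather than waiting for a full batch.
func batchLoop(in <-chan update, send func([]update)) {
    for {
        u, ok := <-in
        if !ok {
            return
        }
        batch := []update{u}
    drain:
        for len(batch) < maxBatch {
            select {
            case u, ok := <-in:
                if !ok {
                    break drain
                }
                batch = append(batch, u)
            default:
                break drain // channel is empty: flush immediately
            }
        }
        send(batch)
    }
}

So a batch only grows towards the cap when updates arrive faster than the consumer drains them.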
I suggest that we just periodically call syncAllAutoHostendpoints of autoHostEndpointController to make sure we have all the expected hostendpoint
This won't help us spot missing nodes, since this logic runs based on the cache of nodes we have received from the syncer. We definitely need to run this off the cache instead of polling nodes, though.
I think we need to dig into why it's actually taking so long to process the new node. My suspicion is that we're actually hitting client side rate limiting here because we aren't using the syncer for HostEndpoint objects, and instead going straight to the API. For a large number of host endpoints, this will cause us to self rate limit requests.
A fix for that would be to stop using the client here: https://github.com/projectcalico/calico/blob/77ecdcda90c0a8a38ae247067169523484c6c8e3/kube-controllers/pkg/controllers/node/hostendpoints.go#L52
And instead replace the Get() and List() calls with a cache fed by the syncer, much like we are doing for nodes.
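A rough sketch of what I mean (hypothetical names; the type assertions against the syncer update types are assumptions, not the existing controller code):

import (
    "sync"

    apiv3 "github.com/projectcalico/api/pkg/apis/projectcalico/v3"
    bapi "github.com/projectcalico/calico/libcalico-go/lib/backend/api"
    "github.com/projectcalico/calico/libcalico-go/lib/backend/model"
)

// hepCache is kept up to date by the syncer; the controller reads from it
// instead of issuing Get()/List() calls against the API server.
type hepCache struct {
    sync.RWMutex
    heps map[string]*apiv3.HostEndpoint // keyed by HostEndpoint name
}

func newHEPCache() *hepCache {
    return &hepCache{heps: map[string]*apiv3.HostEndpoint{}}
}

// OnUpdates is driven by the syncer and only mutates the local map.
func (c *hepCache) OnUpdates(updates []bapi.Update) {
    c.Lock()
    defer c.Unlock()
    for _, u := range updates {
        k, ok := u.Key.(model.ResourceKey)
        if !ok || k.Kind != apiv3.KindHostEndpoint {
            continue
        }
        if u.Value == nil {
            delete(c.heps, k.Name) // deleted from the datastore
            continue
        }
        // Assumes the syncer delivers the v3 resource as the KVPair value.
        c.heps[k.Name] = u.Value.(*apiv3.HostEndpoint)
    }
}

// get would back what is currently a client Get() call.
func (c *hepCache) get(name string) (*apiv3.HostEndpoint, bool) {
    c.RLock()
    defer c.RUnlock()
    hep, ok := c.heps[name]
    return hep, ok
}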
Do you have a full log from the kube-controllers pod while it is in this state I could look at?
Thanks @caseydavenport for your kind answer.
“I don't think this should be a problem, and this logic is used extensively throughout Calico so I am pretty confident it's not waiting here.” For other components this might be true, but kube-controllers is a single instance watching a great many resources, so lag is not impossible.
To get a better view, I slightly changed the code as shown below:
// sendUpdates is used to send the consolidated set of updates. Returns nil.
func (ws *watcherSyncer) sendUpdates(updates []api.Update) []api.Update {
    log.WithField("NumUpdates", len(updates)).Info("Sending syncer updates (if any to send)")
    if len(updates) > 0 {
        ws.callbacks.OnUpdates(updates)
    }
    log.WithField("NumUpdates", len(updates)).Info("Sending syncer updates is done")
    return nil
}
And I turned on DEBUG logging and filtered the logs with grep; below are the relevant logs:
2024-08-28 02:47:26.330 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=612
2024-08-28 02:47:26.330 [INFO][1] watchersyncer.go 251: Sending syncer updates (if any to send) NumUpdates=1001
2024-08-28 02:50:27.562 [DEBUG][1] watchercache.go 384: Cache entry added, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:27.569 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:27.582 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:27.989 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:28.464 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:37.987 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:39.139 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:50.529 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=1001
2024-08-28 02:50:50.530 [INFO][1] watchersyncer.go 251: Sending syncer updates (if any to send) NumUpdates=1001
2024-08-28 02:50:50.544 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:50.546 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:50:58.648 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:08.106 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:08.116 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:08.476 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:12.943 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:12.963 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:13.187 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:19.367 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:29.772 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:39.994 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:51:50.245 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:52:00.493 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.729 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=1001
2024-08-28 02:54:14.730 [INFO][1] watchersyncer.go 251: Sending syncer updates (if any to send) NumUpdates=875
2024-08-28 02:54:14.743 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.754 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.768 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.783 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.798 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.819 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.836 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.851 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:54:14.868 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:57:12.930 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:12.930 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:12.930 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep
2024-08-28 02:57:12.930 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.129 [DEBUG][1] customresource.go 269: Error getting resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.134 [DEBUG][1] validator.go 329: Validate name: work-node-xxxxxxxxxxx
2024-08-28 02:57:13.134 [DEBUG][1] customresource.go 97: Create custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints"
2024-08-28 02:57:13.330 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.330 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=875
2024-08-28 02:57:13.330 [INFO][1] watchersyncer.go 251: Sending syncer updates (if any to send) NumUpdates=126
2024-08-28 02:57:13.330 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:13.331 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:13.331 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:13.331 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.340 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:13.530 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.530 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.530 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.530 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:13.530 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:13.530 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:13.530 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:13.730 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.730 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.730 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:13.857 [DEBUG][1] watchercache.go 421: Marking key as re-processed Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:57:13.857 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:57:14.130 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:14.130 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:14.130 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:14.130 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:14.329 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:14.329 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:14.329 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:15.329 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:15.329 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:15.329 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:15.329 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:15.529 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:15.529 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:15.529 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:33.929 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:33.929 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:33.929 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:33.929 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:34.129 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:34.129 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:34.129 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:36.329 [DEBUG][1] hostendpoints.go 208: syncing hostendpoint for node "work-node-xxxxxxxxxxx". attempt #1
2024-08-28 02:57:36.330 [DEBUG][1] hostendpoints.go 182: syncing hostendpoint "work-node-xxxxxxxxxxx-auto-hep" from node &{TypeMeta:{Kind:Node APIVersion:projectcalico.org/v3}
2024-08-28 02:57:36.330 [DEBUG][1] customresource.go 250: Get custom Kubernetes resource Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Resource="HostEndpoints" Revision=""
2024-08-28 02:57:36.330 [DEBUG][1] customresource.go 261: Get custom Kubernetes resource by name Key=HostEndpoint(work-node-xxxxxxxxxxx-auto-hep) Name="work-node-xxxxxxxxxxx-auto-hep"
current: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"HostEndpoint", APIVersion:"projectcalico.org/v3"},
expected: &v3.HostEndpoint{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""},
2024-08-28 02:57:36.529 [DEBUG][1] hostendpoints.go 385: hostendpoint does not need update hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:36.529 [DEBUG][1] hostendpoints.go 405: hostendpoint not updated hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:36.529 [DEBUG][1] hostendpoints.go 200: successfully synced hostendpoint hep.Name="work-node-xxxxxxxxxxx-auto-hep"
2024-08-28 02:57:39.130 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=126
2024-08-28 02:57:39.130 [INFO][1] watchersyncer.go 251: Sending syncer updates (if any to send) NumUpdates=314
2024-08-28 02:57:39.153 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:57:39.170 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
2024-08-28 02:57:39.185 [DEBUG][1] watchercache.go 372: Datastore entry modified, sending syncer update Key="Node(work-node-xxxxxxxxxxx)" ListRoot="/calico/resources/v3/projectcalico.org/nodes"
Node(work-node-xxxxxxxxxxx) joined the cluster at 2024-08-28 02:50:27.562 and the watchercache sent the update to the watchersyncer, but the watchersyncer was still busy dealing with the 1001 updates from 2024-08-28 02:47:26.330. After it dealt with those 1001 updates, it processed another 1001 updates starting at 2024-08-28 02:50:50.530, and then continued with the 875 updates from 2024-08-28 02:54:14.730; during that batch the node was finally handled at 2024-08-28 02:57:12.930 and the corresponding hep was created at 2024-08-28 02:57:13.330.
So it took nearly 7 minutes to get the hep created, and this is only a small cluster (dozens of nodes). You can see a lot of "Datastore entry modified" updates, and for each of these we need to compare the IP and labels of the relevant hep, but most of the time we end up with "hostendpoint does not need update". That's why I said we waste too much time dealing with useless updates, and we should just use a periodic-checking mechanism like what felix does for iptables/routes.
@MageekChiu thanks for the logs. This further confirms my theory from my previous comment. The watcher syncer isn't waiting for 1000 updates before sending.
You can see that clearly here:
2024-08-28 02:47:26.330 [INFO][1] watchersyncer.go 255: Sending syncer updates is done NumUpdates=612
What is happening is that the watcher syncer is calling OnUpdates with a batch, and the code handling that update is taking too long to process those updates. The "correct" way would be for the host endpoint controller code to simply cache the incoming updates and then perform the necessary work on a separate goroutine instead of blocking OnUpdates until the work is complete.
The problem is almost certainly that this function gets ratelimited: https://github.com/caseydavenport/calico/blob/06ec437ae0a011a3d1aff9c08488bffd6eab78fc/kube-controllers/pkg/controllers/node/hostendpoints.go#L74-L112
Due to direct API access here: https://github.com/caseydavenport/calico/blob/06ec437ae0a011a3d1aff9c08488bffd6eab78fc/kube-controllers/pkg/controllers/node/hostendpoints.go#L185
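To give a sense of the arithmetic (a generic client-go token-bucket illustration, not Calico's actual client configuration; the 5 QPS / burst 10 values are the usual client-go defaults and are assumed here):

import (
    "fmt"
    "time"

    "k8s.io/client-go/util/flowcontrol"
)

func demo() {
    limiter := flowcontrol.NewTokenBucketRateLimiter(5, 10) // 5 QPS, burst of 10

    start := time.Now()
    const calls = 300 // e.g. one Get/Create per host endpoint during a resync
    for i := 0; i < calls; i++ {
        limiter.Accept() // blocks until a token is available
        // ... the actual API request would go here ...
    }
    // After the initial burst the remaining ~290 calls are throttled to 5/s,
    // so roughly a minute is spent just waiting on the limiter.
    fmt.Printf("%d rate-limited calls took %s\n", calls, time.Since(start))
}

Multiply that by several Get/Update calls per node per batch and the multi-minute gaps between "Sending syncer updates" log lines above start to look familiar.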
That's why I said we waste too much time dealing with useless updates, and we should just use a periodic-checking mechanism like what felix does to iptables/routes.
Polling is the wrong answer to this problem - polling is going to add substantial load to the cluster and will cause other scale issues.
Compare this to the IPAM controller implementation, which simply puts the update on a channel to another goroutine and returns: https://github.com/caseydavenport/calico/blob/06ec437ae0a011a3d1aff9c08488bffd6eab78fc/kube-controllers/pkg/controllers/node/ipam.go#L220-L232
This prevents blocking the syncer from sending more updates.
Yes, a local cache and a dedicated worker goroutine are the answer. The core idea from my side is that we should not react to every update; there may be too many updates and most of them are useless.
Polling is the wrong answer to this problem - polling is going to add substantial load to the cluster and will cause other scale issues.
True, calling syncAllAutoHostendpoints too often would cause this problem. I did not think it through. But just to discuss, is calling it every 5 mins too often?
After seeing the IPAM controller, it comes to mind that a periodic-checking mechanism does not necessarily mean polling the server; reading from a local cache is a core feature of the k8s informer, I think. Correct me if I am wrong, thanks.
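For example, for plain Kubernetes resources this is just the standard informer/lister pattern (a sketch with assumed kubeconfig handling; for the projectcalico.org resources we would read from a syncer-fed cache instead):

import (
    "fmt"
    "time"

    "k8s.io/apimachinery/pkg/labels"
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

func run() error {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        return err
    }
    cs := kubernetes.NewForConfigOrDie(cfg)

    factory := informers.NewSharedInformerFactory(cs, 10*time.Minute)
    nodeLister := factory.Core().V1().Nodes().Lister()

    stop := make(chan struct{})
    factory.Start(stop)
    factory.WaitForCacheSync(stop)

    // The periodic check is served entirely from the informer's local cache,
    // so each tick adds no load on the API server.
    for range time.Tick(5 * time.Minute) {
        nodes, err := nodeLister.List(labels.Everything())
        if err != nil {
            return err
        }
        fmt.Printf("checking hostendpoints for %d cached nodes\n", len(nodes))
        // ... compare against the expected hostendpoints here ...
    }
    return nil
}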
Due to direct API access here: https://github.com/caseydavenport/calico/blob/06ec437ae0a011a3d1aff9c08488bffd6eab78fc/kube-controllers/pkg/controllers/node/hostendpoints.go#L185
I suspected a rate limit too, but I did not find any relevant logs. Could you help point out how I can confirm this? The k8s API server logs?
But just to discuss, is calling it every 5 mins too often?
Polling every 5 minutes would mean we at best detect new config every 5 minutes.
Hi @caseydavenport , may I ask if you are going to work on this, or should I submit a PR? Thank you.
@MageekChiu If you feel comfortable with the changes, feel free to make a PR - I would like to fix it but I won't have time for a while.
I think the changes I would like to see are:
- Replace the Get() and List() calls to instead use the cache.
- Change the OnUpdates implementation in the HEP controller to send updates via a channel and perform the work on a separate goroutine (following the pattern of the IPAM controller).
Hi @caseydavenport, as I was crafting the PR, two things came to mind that I'd like to discuss here:
Yes, the ultimate desired pattern here is to use a separate goroutine - that's what allows us to offload the majority of the processing so that it doesn't block the OnUpdates() call. However:
Here's an example of that second point:
func (c *controller) OnUpdate(upd Update) {
    // Just hand the update to the reconcile goroutine; never block here.
    c.updateChan <- upd
}

func (c *controller) reconcile() {
    for {
        select {
        case upd := <-c.updateChan:
            // Got an update - add it to our cache.
            c.cache[upd.Key] = upd
        case <-time.After(period):
            // Perform a sync based on the cache.
            c.sync()
        }
    }
}
So, all cache access would happen within go reconcile(), and the OnUpdate function would simply send the updates through to the shared loop.
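For completeness, the wiring at startup would then be roughly (placeholder names matching the sketch above, not existing kube-controllers symbols; updateChan is assumed to be buffered):

func (c *controller) Start() {
    // reconcile() is the single owner of the cache; nothing else reads or
    // writes it directly.
    go c.reconcile()
}

// The syncer-facing callback never blocks on API calls; it only hands the
// updates over to the reconcile goroutine via updateChan.
func (c *controller) OnUpdates(updates []Update) {
    for _, u := range updates {
        c.OnUpdate(u)
    }
}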
In a large k8s cluster (say more than 100 nodes), when a node joins the cluster it relies on its hostendpoint (auto-created by kube-controllers) for proper network connectivity (because we use label-based policy), but it might take many minutes, or even hours, for the hostendpoint to be created, leaving the node unavailable in the meantime.
Expected Behavior
I expect the hostendpoint to be created within a predictable timeframe.
Possible Solution
I've walked through the relevant code and done some tests; here are some of my observations:
To summarize, the event-driven mechanism we have now for autoHostEndpointController spends most of its time checking useless updates, so truly useful updates (like a new node) cannot be dealt with in a timely manner (we don't know how many updates are queued ahead of the creation update, and thus how long it will take before it is processed). I suggest we use a periodic-checking mechanism to make sure all the hostendpoints reach the expected state within a predictable timeframe.
A workaround is to restart calico-kube-controllers, which triggers a full resync so the hostendpoint gets created right away, but I don't think that's an elegant solution.
Steps to Reproduce (for bugs)