@hmlkao a few things to point out:
What makes you think that Calico is at fault here? It very well might be, but from what I can tell the pod goes into Init state at 11:12:13, and then calico/node learns about the IP at 11:12:18, only 5 seconds later. That suggests that the process of launching a container, allocating an IP, configuring routes and sysctls, and communicating the IP to calico/node takes 5 seconds, which is pretty reasonable.
12:13:22 - IP address assigned to pod
How are you measuring this? From the log, it appears the IP is assigned well before that (at 11:12:18)
If I had to guess, what you are seeing is that Calico is allocating the IP addresses immediately, but that kubelet is not reporting the IP back to the API until some time later due to batching / rate-limiting mentioned above. This is the original issue about that, which wasn't deemed serious enough to fix: https://github.com/kubernetes/kubernetes/issues/39113
@caseydavenport , thanks for your response and notes.
What I thought was that the delay is created between Calico allocating an IP address from the range and writing it to "somewhere" (I don't know how it is processed) on the node. But as you mentioned, the delay can be caused by kubelet before it reports it back to the API. However, the pod is able to create connections to the network after the IP address is reported, which IMO should be independent of reporting the IP address to the API (I suppose that the connection should be available right after the interface is created). It's a little bit confusing to me.
Maybe I could check the interfaces created on the node during a test (which I didn't do). Or can you advise me what I should look for?
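For example, I imagine I could watch something like this on the node while the test runs (just a sketch; `cali` is the default prefix Calico uses for pod veth interfaces):

```bash
# Count the Calico-created veth interfaces on the node once per second;
# each running pod on the node should correspond to one cali* interface.
watch -n1 'ip -o link show | grep -c cali'

# Stream interface add/delete events live while the deployment scales up.
ip monitor link

# Per-pod /32 routes that Calico programs via the cali* interfaces.
ip route | grep cali
```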
How are you measuring this?
"Measured" as simply as I could by refreshing the kubectl
in terminal
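For reference, something like this would be a less manual way to time it (the label selector is only an example):

```bash
# Prefix every pod update streamed from the API with a local timestamp,
# so the moment an IP appears can be pinned down without manual refreshing.
kubectl get pods -l app=demo-ondra-appmain-dev -o wide --watch \
  | while IFS= read -r line; do printf '%s %s\n' "$(date +%H:%M:%S)" "$line"; done
```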
Thanks for the link to the issue, I'll try some tests with modified `kubelet` parameters.
Unfortunately, changing the `kubelet` parameters (`kube-api-qps`, `kube-api-burst`, `event-qps` and `event-burst`) doesn't help. The delay is still the same.
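For context, the change amounts to something like this (values below are arbitrary examples, and the kubeadm-style wiring via `KUBELET_EXTRA_ARGS` is just one way to set the flags):

```bash
# The four client-side rate-limit parameters passed as kubelet flags.
# They do not affect the kubelet's internal status-update batching
# referenced in kubernetes/kubernetes#39113.
KUBELET_EXTRA_ARGS="--kube-api-qps=50 --kube-api-burst=100 --event-qps=50 --event-burst=100"
```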
@hmlkao as discovered on the original issue, those parameters do help somewhat in extreme circumstances, but they are not the only problem here. Those are API client rate-limiting parameters, whereas this is a kubelet-specific piece of batching code which further limits this, and which I don't believe is configurable.
"Measured" as simply as I could by refreshing the kubectl in terminal
The vast majority of the time before this shows up in kubectl is likely to be outside of Calico's control. Based on the logs you provided, Calico appears to be assigning an address and returning it to Kubernetes quickly, so I don't think this is a Calico issue, but rather a problem elsewhere in the system.
the pod is able to create connections to the network after the IP address is reported, which IMO should be independent of reporting the IP address to the API
I would expect the pod to have network access very shortly after the IP is assigned to the interface, which will happen before it is reported in the API. However, note that some network functions (like services) require the IP to appear in the API in order to function because the IP needs to be known on remote nodes as well, so not all network access will succeed until the IP does appear in the API.
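If it helps, something like the following shows when the IP has actually made it into the API and when it becomes usable for Service-based access (the endpoints name below assumes a Service matching your deployment; it's only a sketch):

```bash
# When the pod IP has actually been reported to the API server:
kubectl get pod demo-ondra-appmain-dev-798749569d-zjdtl -o jsonpath='{.status.podIP}{"\n"}'

# Service-based access from remote nodes needs the IP to show up as an endpoint as well:
kubectl get endpoints demo-ondra-appmain-dev -o wide
```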
Or can you advise me what I should look for?
12:12:32 - connect-inject-init container start looking for service
Could you explain what this init container does? Is this an init container in the pod that is launching?
Sorry for the delay, I wanted to give you a comprehensive answer, so I had to read some more information.
Could you explain what this init container does? Is this an init container in the pod that is launching?
This init container is part of Consul Connect (the service mesh implementation by Consul). The container is injected into the pod by Consul's mutating webhook; it tries to open a connection to the Consul Agent daemonset on the node (via `HOST_IP` and a `nodePort`) and looks for the associated service registered in the Consul catalog in order to prepare the config for the Envoy sidecar running beside the app container.
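Conceptually it boils down to a loop like the one below (just my approximation; the real init container uses Consul's own tooling, and the port and service name here are assumptions):

```bash
# Roughly the polling the injected init container performs: ask the node-local
# Consul agent (reached through the downward-API HOST_IP and a nodePort; the
# port here is an assumption) whether the service for this pod is registered yet.
until curl -sf "http://${HOST_IP}:8500/v1/agent/services" | grep -q demo-ondra-appmain-dev; do
  echo "waiting for service registration..."
  sleep 1
done
```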
The K8s deployment and pod manifests of the app are attached: deployment.yaml.txt, pod.yaml.txt
I thought that it could be a problem with an overloaded Consul Agent, so I did a test with the app without this init container, and the behavior is similar. So it isn't caused by Consul at all.
I've done more tests checking when the container is created on the node, looked at tons of logs, and I'm more and more convinced that the problem is outside of Calico's scope. So thank you very much for the cooperation, I'll try to investigate elsewhere.
Ok cool, I'm going to close this for now but if you do find more evidence that this is related to Calico just shout and we can reopen.
Expected Behavior
When I scale up a deployment from 1 to 100 replicas which are all scheduled to one node, it takes about 2-3 minutes before all of them obtain an IP address. I expect that IPs would be assigned much faster.
Current Behavior
The time between scheduling a pod to a node and assigning an IP address to that pod increases with the number of pods scheduled to that node. In other words, when I set 10 replicas, addresses are assigned much faster than when I set 100 replicas.
Possible Solution
I'm not sure where exactly the delay occurs, see the Example below.
I've tried:
- giving `calico-node` pods much more resources, but without any difference in behaviour
- `calico-node` pods also without success

Steps to Reproduce (for bugs)
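A rough sketch of the reproduction (the label selector below is only an example, and it assumes all replicas land on the same node):

```bash
# Scale to 100 replicas and measure how long until every pod reports an IP address.
kubectl scale deployment demo-ondra-appmain-dev --replicas=100
start=$(date +%s)
while true; do
  n=$(kubectl get pods -l app=demo-ondra-appmain-dev -o jsonpath='{.items[*].status.podIP}' | wc -w)
  [ "$n" -ge 100 ] && break
  sleep 1
done
echo "all pods report an IP after $(( $(date +%s) - start ))s"
```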
Context
When the dev team releases a new app version and many pods are scheduled to one of the workers, there is a significant delay before the pods are really Running. We are running K8s on powerful bare-metal workers (about 100 vCPUs and 512 GB RAM) using Calico as the CNI.
Your Environment
Example timeline of issue
Test done on one "clean" worker in our lab environment without any other workload; logs were taken from the console of the `consul-connect-inject-init` container within an app pod and of the `calico-node` pod. Full logs are in files consul-init.log and calico-node.log.
- Deployment `demo-ondra-appmain-dev` scaled to 100 replicas
- Pod (`demo-ondra-appmain-dev-798749569d-zjdtl`) created (in Pending state, scheduled to `k8s-dc1-016`)
- `calico-node` pod: first record about this pod
- `calico-node` chose the IP address to be assigned to the pod < YES! since this time the IP is known
- `connect-inject-init` container starts looking for the service
- `connect-inject-init` container is successful > init containers were completed > Running state

If you need any other details, let me know.
It would be great to know where the delay occurs and what the bottleneck of this behavior is.