Open bpradipt opened 2 years ago
Do we have a baseline for what "startup time ... using Kata with local hypervisor" is?
@jtumber-ibm I'll get some baseline pod startup time data and update this issue.
@bpradipt @jtumber-ibm Tested on an AWS OpenShift cluster. The following startup times were observed.

For a peer pod on a t3.small instance: 1 minute 41 seconds
```
Name:         hello-openshift
Namespace:    default
Priority:     0
Node:         ip-10-0-187-226.us-east-2.compute.internal/10.0.187.226
Start Time:   Wed, 08 Feb 2023 18:11:16 +0530
Labels:       app=hello-openshift
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.22"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.22"
                    ],
                    "default": true,
                    "dns": {}
                }]
              kata.peerpods.io/instance_type: t2.small
Status:       Running
IP:           10.128.2.22
IPs:
  IP:  10.128.2.22
Containers:
  hello-openshift:
    Container ID:   cri-o://07062a709864f4e32723adfe1fd9c9d51464d76e9ca938dc286373c1d5f5fb85
    Image:          quay.io/openshift/origin-hello-openshift
    Image ID:       quay.io/openshift/origin-hello-openshift@sha256:369c8324bea625fdb94c1e8cc992ae9d2b0fe780acace7b8c8080349792f3b96
    Port:           8888/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 08 Feb 2023 18:12:47 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      kata.peerpods.io/vm:  1
    Requests:
      kata.peerpods.io/vm:  1
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-ntdth (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-ntdth:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  AddedInterface  100s  multus   Add eth0 [10.128.2.22/23] from openshift-sdn
  Normal  Pulling         14s   kubelet  Pulling image "quay.io/openshift/origin-hello-openshift"
  Normal  Pulled          14s   kubelet  Successfully pulled image "quay.io/openshift/origin-hello-openshift" in 609.587941ms
  Normal  Created         11s   kubelet  Created container hello-openshift
  Normal  Started         11s   kubelet  Started container hello-openshift
```
For a peer pod on a c5.xlarge instance: 2 minutes 45 seconds
```
Name:         hello-openshift
Namespace:    default
Priority:     0
Node:         ip-10-0-187-226.us-east-2.compute.internal/10.0.187.226
Start Time:   Wed, 08 Feb 2023 18:00:31 +0530
Labels:       app=hello-openshift
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.21"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.21"
                    ],
                    "default": true,
                    "dns": {}
                }]
              kata.peerpods.io/instance_type: t2.small
Status:       Running
IP:           10.128.2.21
IPs:
  IP:  10.128.2.21
Containers:
  hello-openshift:
    Container ID:   cri-o://cbd66cd5255be1d61a0fa4440b8480ffe8a9b89b71dc288a2fef561f81980b90
    Image:          quay.io/openshift/origin-hello-openshift
    Image ID:       quay.io/openshift/origin-hello-openshift@sha256:369c8324bea625fdb94c1e8cc992ae9d2b0fe780acace7b8c8080349792f3b96
    Port:           8888/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 08 Feb 2023 18:01:56 +0530
    Ready:          True
    Restart Count:  0
    Limits:
      kata.peerpods.io/vm:  1
    Requests:
      kata.peerpods.io/vm:  1
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dwdsx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-dwdsx:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age    From     Message
  ----    ------          ----   ----     -------
  Normal  AddedInterface  2m10s  multus   Add eth0 [10.128.2.21/23] from openshift-sdn
  Normal  Pulling         50s    kubelet  Pulling image "quay.io/openshift/origin-hello-openshift"
  Normal  Pulled          49s    kubelet  Successfully pulled image "quay.io/openshift/origin-hello-openshift" in 549.836917ms
  Normal  Created         47s    kubelet  Created container hello-openshift
  Normal  Started         47s    kubelet  Started container hello-openshift
```
For a runc pod (default runtime): 7 seconds
```
Name:         nginx
Namespace:    default
Priority:     0
Node:         ip-10-0-219-218.us-east-2.compute.internal/10.0.219.218
Start Time:   Wed, 08 Feb 2023 17:47:28 +0530
Labels:       run=nginx
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.131.0.18"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.131.0.18"
                    ],
                    "default": true,
                    "dns": {}
                }]
Status:       Running
IP:           10.131.0.18
IPs:
  IP:  10.131.0.18
Containers:
  nginx:
    Container ID:  cri-o://17418aed26b8c36b4aca40f3724c620ee4022440109593ef0dcf2449939f9994
    Image:         quay.io/bpradipt/http-echo:nonroot
    Image ID:      quay.io/bpradipt/http-echo@sha256:ec3c0642ff30050ea3f462c8f662ffa53036f6a3e089085714c7ff13501002dc
    Port:          <none>
    Host Port:     <none>
    Args:
      -text
      Hello from Kata container
      -listen
      :8080
    State:          Running
      Started:      Wed, 08 Feb 2023 17:47:33 +0530
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-77wkg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-77wkg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age  From               Message
  ----    ------          ---- ----               -------
  Normal  Scheduled       12s  default-scheduler  Successfully assigned default/nginx to ip-10-0-219-218.us-east-2.compute.internal
  Normal  AddedInterface  10s  multus             Add eth0 [10.131.0.18/23] from openshift-sdn
  Normal  Pulling         9s   kubelet            Pulling image "quay.io/bpradipt/http-echo:nonroot"
  Normal  Pulled          7s   kubelet            Successfully pulled image "quay.io/bpradipt/http-echo:nonroot" in 2.274906876s
  Normal  Created         7s   kubelet            Created container nginx
  Normal  Started         7s   kubelet            Started container nginx
```
Thanks a lot @abhbaner. Can you please confirm the pod VM image distro: CentOS, RHEL, or Ubuntu?
The pod VM image distro is CentOS.
Putting the timing data together for easier understanding:
runc:

```
Normal  Scheduled       12s  default-scheduler  Successfully assigned default/nginx to ip-10-0-219-218.us-east-2.compute.internal
Normal  AddedInterface  10s  multus             Add eth0 [10.131.0.18/23] from openshift-sdn
Normal  Pulling         9s   kubelet            Pulling image "quay.io/bpradipt/http-echo:nonroot"
Normal  Pulled          7s   kubelet            Successfully pulled image "quay.io/bpradipt/http-echo:nonroot" in 2.274906876s
Normal  Created         7s   kubelet            Created container nginx
Normal  Started         7s   kubelet            Started container nginx
```

peer-pod (c5.xlarge):

```
Normal  AddedInterface  2m10s  multus   Add eth0 [10.128.2.21/23] from openshift-sdn
Normal  Pulling         50s    kubelet  Pulling image "quay.io/openshift/origin-hello-openshift"
Normal  Pulled          49s    kubelet  Successfully pulled image "quay.io/openshift/origin-hello-openshift" in 549.836917ms
Normal  Created         47s    kubelet  Created container hello-openshift
Normal  Started         47s    kubelet  Started container hello-openshift
```

peer-pod (t3.small):

```
Normal  AddedInterface  100s  multus   Add eth0 [10.128.2.22/23] from openshift-sdn
Normal  Pulling         14s   kubelet  Pulling image "quay.io/openshift/origin-hello-openshift"
Normal  Pulled          14s   kubelet  Successfully pulled image "quay.io/openshift/origin-hello-openshift" in 609.587941ms
Normal  Created         11s   kubelet  Created container hello-openshift
Normal  Started         11s   kubelet  Started container hello-openshift
```
A rough sequence diagram of the Pod VM startup process
@jtumber-ibm @stevenhorsman @surajssd @yoheiueda @huoqifeng
I did the same experiments on my cluster in IBM Cloud VPC.
runc:

```
Type    Reason     Age  From               Message
----    ------     ---- ----               -------
Normal  Scheduled  4s   default-scheduler  Successfully assigned default/nginx to k8s-worker
Normal  Pulling    4s   kubelet            Pulling image "nginx"
Normal  Pulled     3s   kubelet            Successfully pulled image "nginx" in 1.173574561s
Normal  Created    3s   kubelet            Created container nginx
Normal  Started    3s   kubelet            Started container nginx
```

peer-pod:

```
Type    Reason     Age  From               Message
----    ------     ---- ----               -------
Normal  Scheduled  44s  default-scheduler  Successfully assigned default/nginx to k8s-worker
Normal  Pulling    7s   kubelet            Pulling image "nginx"
Normal  Pulled     3s   kubelet            Successfully pulled image "nginx" in 4.218585385s
Normal  Created    3s   kubelet            Created container nginx
Normal  Started    1s   kubelet            Started container nginx
```
I calculated the elapsed time for each state transition as follows. I think the container image was already cached in the runc case.

| State transition | runc | peer-pod |
|---|---|---|
| Scheduled → Pulling | 0s | 37s |
| Pulling → Pulled | 1s | 4s |
| Pulled → Created | 0s | 0s |
| Created → Started | 0s | 2s |
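For anyone repeating this, the transition deltas above can be reproduced from the event ages in the `kubectl describe`/`kubectl get events` output, since ages count backwards from "now". A minimal sketch (the event names and age strings are taken from the peer-pod run above; the helper names are my own):

```python
import re

def parse_age(age: str) -> int:
    """Convert a kubectl event age like '2m10s' or '44s' into seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", age)
    minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return minutes * 60 + seconds

def transitions(events: dict[str, str]) -> dict[str, int]:
    """Elapsed seconds between consecutive event states.

    Ages count down from the present, so the elapsed time between two
    states is the difference of their ages.
    """
    names = list(events)
    ages = [parse_age(a) for a in events.values()]
    return {f"{a}->{b}": x - y for (a, b), (x, y)
            in zip(zip(names, names[1:]), zip(ages, ages[1:]))}

# Event ages from the peer-pod run above
peer_pod = {"Scheduled": "44s", "Pulling": "7s", "Pulled": "3s",
            "Created": "3s", "Started": "1s"}
print(transitions(peer_pod))
# {'Scheduled->Pulling': 37, 'Pulling->Pulled': 4, 'Pulled->Created': 0, 'Created->Started': 2}
```

Note the 1s granularity of event ages: sub-second transitions (e.g. Pulled → Created for runc) show up as 0s.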
I suspect getting close to a baseline with local kata on kvm/qemu will be tough. I don't have concrete measurements at the moment, but time-to-pod on local kata was a matter of a few seconds (~3-10s), even for an SEV-SNP enabled guest, which has a bit of an init overhead.
Below is a measurement from a CAA test. It's not broken down by the individual CAA steps, but you can roughly extrapolate from the k8s events: most of the gap is spent waiting for `agent-protocol-forwarder` to be responsive and on kata-agent's pod provisioning (pulling images, etc.). Running `systemd-analyze plot` on a peer-pod VM can yield possible opportunities for improvement.

```
$ kubectl get events -o custom-columns=Timestamp:.firstTimestamp,Message:.message --field-selector involvedObject.name=nginx-7df44f78b6-tgxnx
Timestamp              Message
2023-01-27T08:54:06Z   Successfully assigned default/nginx-7df44f78b6-tgxnx to k8s-ci
2023-01-27T08:54:51Z   Container image "nginx:1.14.2" already present on machine
2023-01-27T08:54:51Z   Created container nginx
2023-01-27T08:54:53Z   Started container nginx
2023-01-27T08:54:53Z   Container image "*" already present on machine
2023-01-27T08:54:53Z   Created container sidecar
2023-01-27T08:54:55Z   Started container sidecar
```
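With absolute event timestamps like these, the gaps fall out directly. A small sketch (timestamps copied from the events above; assumes RFC 3339 timestamps with a trailing `Z`, as kubectl emits them):

```python
from datetime import datetime

def delta_seconds(t0: str, t1: str) -> float:
    """Seconds elapsed between two RFC 3339 timestamps (trailing 'Z')."""
    # fromisoformat() on older Pythons doesn't accept 'Z', so normalize it.
    parse = lambda t: datetime.fromisoformat(t.replace("Z", "+00:00"))
    return (parse(t1) - parse(t0)).total_seconds()

# From the CAA test events above: pod assigned at 08:54:06,
# first container event (i.e. kata-agent reachable) at 08:54:51.
print(delta_seconds("2023-01-27T08:54:06Z", "2023-01-27T08:54:51Z"))  # 45.0
```

So roughly 45s of this run is pod-VM boot plus agent startup, before any container work begins.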
Some additional data purely from the pod VM image standpoint. I used both CentOS Stream 8 and Ubuntu 20.04 cloud images for relative comparison.
CentOS Stream 8 cloud image
```
$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
echo-kata-67ccb48d7b-wpwnq   1/1     Running   0          69s

[root@podvm-echo-kata-67ccb48d7b-wpwnq-6e216995 peerpod]# systemd-analyze time
Startup finished in 1.266s (kernel) + 3.378s (initrd) + 47.053s (userspace) = 51.698s
multi-user.target reached after 29.136s in userspace

[root@podvm-echo-kata-67ccb48d7b-wpwnq-6e216995 peerpod]# systemd-analyze blame
24.677s cloud-init.service
18.057s kdump.service
 3.278s NetworkManager-wait-online.service
 2.283s cloud-init-local.service
 1.062s initrd-switch-root.service
  773ms dracut-initqueue.service
[snip]
```
Disabling kdump.service from the pod VM image gave the following results:
```
[kcli@virtlab714 peerpods-files]$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
echo-kata-67ccb48d7b-gt9nv   1/1     Running   0          59s

[peerpod@podvm-echo-kata-67ccb48d7b-gt9nv-4b58c292 peerpod]# systemd-analyze time
Startup finished in 1.237s (kernel) + 3.378s (initrd) + 30.510s (userspace) = 35.126s
multi-user.target reached after 29.231s in userspace

[root@podvm-echo-kata-67ccb48d7b-gt9nv-4b58c292 peerpod]# systemd-analyze blame
24.898s cloud-init.service
 3.361s NetworkManager-wait-online.service
 2.264s cloud-init-local.service
 1.001s initrd-switch-root.service
  776ms dracut-initqueue.service
```
Ubuntu 20.04 cloud image
```
$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
echo-kata-67ccb48d7b-r4bcb   1/1     Running   0          70s

root@podvm-echo-kata-67ccb48d7b-r4bcb-15a38e4a:/home/peerpod# systemd-analyze time
Startup finished in 4.150s (kernel) + 14.571s (userspace) = 18.721s
graphical.target reached after 13.728s in userspace

root@podvm-echo-kata-67ccb48d7b-r4bcb-15a38e4a:/home/peerpod# systemd-analyze blame
41.390s apt-daily.service
 3.082s snapd.service
 3.029s systemd-networkd-wait-online.service
 2.803s snap.lxd.activate.service
 2.624s cloud-init-local.service
 1.549s cloud-init.service
 1.414s cloud-config.service
 1.383s apt-daily-upgrade.service
 1.234s dev-sda1.device
 1.222s ua-timer.service
```
Disabling apt-daily.service gave the following results:
```
$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
echo-kata-67ccb48d7b-q4q8r   1/1     Running   0          39s

root@podvm-echo-kata-67ccb48d7b-q4q8r-faf29ce7:/home/peerpod# systemd-analyze time
Startup finished in 4.162s (kernel) + 14.206s (userspace) = 18.369s
graphical.target reached after 13.457s in userspace

root@podvm-echo-kata-67ccb48d7b-q4q8r-faf29ce7:/home/peerpod# systemd-analyze blame
3.025s systemd-networkd-wait-online.service
2.683s snap.lxd.activate.service
2.363s cloud-init-local.service
2.269s snapd.service
1.848s cloud-init.service
```
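The `systemd-analyze blame` outputs above lend themselves to a quick before/after comparison of which units dominate boot. A small parsing sketch (sample lines copied from the CentOS output above; it assumes only the `s`/`ms` value formats shown there, not the `1min 2.3s` form blame can also emit):

```python
def blame_seconds(line: str) -> tuple[str, float]:
    """Parse one 'systemd-analyze blame' line into (unit name, seconds)."""
    value, unit = line.split()
    if value.endswith("ms"):
        return unit, float(value[:-2]) / 1000.0
    return unit, float(value[:-1])  # strip the trailing 's'

# Lines from the CentOS Stream 8 blame output above
lines = ["24.677s cloud-init.service",
         "18.057s kdump.service",
         "773ms dracut-initqueue.service"]
services = dict(blame_seconds(l) for l in lines)
worst = max(services, key=services.get)
print(worst, services[worst])  # cloud-init.service 24.677
```

On this data it singles out cloud-init and kdump as the big CentOS offenders, matching the ~16s userspace improvement observed after disabling kdump.service.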
@yoheiueda @mkulke

To make the peer-pods approach a viable solution for serverless computing, startup times matter. Even otherwise, the startup time should be comparable to using Kata with a local hypervisor (qemu/clh).

This is a tracker issue to document findings and identify the scope for improvement.