bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev
Other
8.59k stars 506 forks source link

Possible Containerd Startup Latency Increase #4163

Open strategicpause opened 3 weeks ago

strategicpause commented 3 weeks ago

Image I'm using: aws-dev variant using core kit v2.3.5. Built using the following commands:

What I expected to happen: I expect containerd to start up in ~1.5s after running /bin/systemctl try-restart containerd.service.

What actually happened: It takes roughly ~6s for containerd to start.

How to reproduce the problem:

  1. Build the AMI:
    $ git clone https://github.com/bottlerocket-os/bottlerocket
    $ cd bottlerocket
    $ cargo make -e ARCH="x86_64" -e VARIANT="aws-dev" -e PUBLISH_REGIONS="us-west-2" build
    $ cargo make -e ARCH="x86_64" -e PUBLISH_AMI_NAME="my-dev-ami" -e VARIANT="aws-dev" -e PUBLISH_REGIONS="us-west-2" ami
  2. Start an EC2 Instance
  3. Run systemd-analyze blame
    bash-5.2# systemd-analyze
    Startup finished in 3.277s (kernel) + 29.099s (userspace) = 32.376s 
    multi-user.target reached after 29.098s in userspace.
    bash-5.2# systemd-analyze blame
    5.670s host-containerd.service
    4.083s docker.service
    3.706s soci-snapshotter.service
    2.220s sundog.service
    1.976s early-boot-config.service
    1.455s settings-applier.service
    1.129s dev-nvme0n1p3.device
    998ms dev-dm\x2d0.device
  4. When looking at the journal logs, there seems to be a 6s delay between when containerd.service is started and when the daemon itself starts. I'll included the relevant journal logs below. Note that containerd only takes 60ms to load, which is expected.
    Aug 23 20:20:24 localhost thar-be-settings[1063]: 20:20:24 [INFO] restart commands ["/bin/systemctl try-restart host-containerd.service"]
    Aug 23 20:20:24 localhost thar-be-settings[1063]: 20:20:24 [INFO] restart commands ["/bin/systemctl try-restart containerd.service"]
    Aug 23 20:20:24 localhost thar-be-settings[1063]: 20:20:24 [INFO] restart commands ["/usr/bin/systemctl try-restart systemd-modules-load"]
    Aug 23 20:20:24 localhost systemd[1]: systemd-modules-load.service: Deactivated successfully.
    Aug 23 20:20:24 localhost systemd[1]: Stopped Load Kernel Modules.
    Aug 23 20:20:24 localhost systemd[1]: Stopping Load Kernel Modules...
    Aug 23 20:20:25 localhost systemd[1]: Starting Load Kernel Modules...
    Aug 23 20:20:25 localhost systemd[1]: Finished Load Kernel Modules.
    Aug 23 20:20:25 localhost thar-be-settings[1063]: 20:20:25 [INFO] restart commands ["/usr/bin/corndog lockdown"]
    Aug 23 20:20:25 localhost thar-be-settings[1063]: 20:20:25 [INFO] restart commands ["netdog write-resolv-conf"]
    Aug 23 20:20:25 localhost systemd[1]: Reloading Network Configuration...
    Aug 23 20:20:25 localhost systemd[1]: Stopping Network Name Resolution...
    Aug 23 20:20:25 localhost systemd[1]: Finished Applies settings to create config files.
    Aug 23 20:20:25 localhost systemd[1]: Started Send signal to CloudFormation Stack.
    Aug 23 20:20:25 localhost systemd[1]: Starting Sets the hostname...
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Finished Sets the hostname.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Reached target Bottlerocket initial configuration complete.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Starting Isolates configured.target...
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemctl[1100]: Created symlink /etc/systemd/system/default.target → /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/configured.target.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Reloading.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-networkd[985]: eth0: found matching network '/etc/systemd/network/10-eth0.network', based on potentially unpredictable interface name.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-networkd[985]: eth0: Reconfiguring with /etc/systemd/network/10-eth0.network.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-networkd[985]: eth0: DHCPv4 connection considered critical, ignoring request to reconfigure it.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-networkd[985]: eth0: DHCPv6 lease lost
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-networkd[985]: eth0: DHCPv4 address 10.192.0.240/24, gateway 10.192.0.1 acquired from 10.192.0.1
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: systemd-resolved.service: Deactivated successfully.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Stopped Network Name Resolution.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Reloaded Network Configuration.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Starting Network Name Resolution...
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Started A versatile implementation of the Network Time Protocol.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Disables UDP offload was skipped because of an unmet condition check (ConditionVirtualization=vmware).
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Starting containerd runtime for host containers...
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Platform Persistent Storage Archival was skipped because of an unmet condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Finished Isolates configured.target.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal chronyd[1111]: 2024-08-23T20:20:25Z chronyd version 4.5 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER -SIGND +ASYNCDNS -NTS -SECHASH +IPV6 -DEBUG)
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal chronyd[1111]: 2024-08-23T20:20:25Z Loaded seccomp filter (level 1)
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-resolved[1109]: Positive Trust Anchors:
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-resolved[1109]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-resolved[1109]: Negative trust anchors: home.arpa 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd-resolved[1109]: Using system hostname 'ip-10-192-0-240.us-west-2.compute.internal'.
    Aug 23 20:20:25 ip-10-192-0-240.us-west-2.compute.internal systemd[1]: Started Network Name Resolution.
    Aug 23 20:20:29 ip-10-192-0-240.us-west-2.compute.internal chronyd[1111]: 2024-08-23T20:20:29Z Selected source 169.254.169.123
    Aug 23 20:20:31 ip-10-192-0-240.us-west-2.compute.internal containerd[1112]: time="2024-08-23T20:20:31.046799766Z" level=info msg="starting containerd" revision=8fc6bcff51318944179630522a095cc9dbf9f353 version=1.7.20+bottlerocket
    [...]
    Aug 23 20:20:31 ip-10-192-0-240.us-west-2.compute.internal containerd[1112]: time="2024-08-23T20:20:31.105048029Z" level=info msg="containerd successfully booted in 0.060867s"
strategicpause commented 3 weeks ago

Not sure if this is helpful, but this is the same command when evaluating Bottlerocket a couple years ago. It's pretty old, but at least shows some precedence for faster boot up times.

bash-5.1# systemd-analyze
Startup finished in 2.015s (kernel) + 7.070s (userspace) = 9.086s
multi-user.target reached after 7.069s in userspace

bash-5.1# systemd-analyze blame
1.839s wicked.service
 538ms settings-applier.service
 467ms docker.service
 439ms host-containerd.service
bcressey commented 2 weeks ago

multi-user.target reached after 29.098s in userspace.

This seems like a significant outlier for an aws-dev boot. At the other extreme, under KVM locally, I get numbers like this:

bash-5.2# systemd-analyze
Startup finished in 616ms (kernel) + 2.226s (userspace) = 2.842s
multi-user.target reached after 2.226s in userspace.

bash-5.2# systemd-analyze blame
443ms systemd-networkd-wait-online.service
331ms docker.service
265ms dev-dm\x2d0.device
261ms dev-vda3.device
229ms load-crash-kernel.service
167ms systemd-udev-trigger.service
141ms systemd-resolved.service
137ms activate-configured.service
132ms systemd-journald.service
122ms soci-snapshotter.service
117ms systemd-logind.service
111ms settings-applier.service
103ms containerd.service

There may be some combination of CPU or I/O throttling at play in the 29 second case.

Some data points for a t3.small launch in EC2:

multi-user.target reached after 7.168s in userspace.
docker.service +1.132s
└─containerd.service @5.675s +315ms
  └─soci-snapshotter.service @5.208s +414ms

multi-user.target reached after 9.413s in userspace.
docker.service +1.525s
└─containerd.service @6.797s +1.034s
  └─soci-snapshotter.service @5.769s +1.002s

multi-user.target reached after 7.835s in userspace.
docker.service +1.065s
└─containerd.service @6.488s +227ms
  └─soci-snapshotter.service @6.061s +383ms

The recent addition of the SOCI snapshotter dependency definitely delays the containerd start, but overall the numbers shouldn't be too far off your older sample.