canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Poor performance assigning IPv4 addresses when concurrently launching containers #10142

Closed TommyLike closed 2 years ago

TommyLike commented 2 years ago

Required information

Issue description

When creating LXC containers concurrently, IPv4 address assignment is slow. For example, when launching 20 containers at once, there is a delay of about 40-50 seconds during which the IPv6 addresses are fully assigned while the IPv4 addresses are still empty.

+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
|                NAME                 |  STATE  | IPV4 |                     IPV6                      |   TYPE    | SNAPSHOTS |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| res44a3e1043840260d3a536dd88deee68a | STOPPED |      |                                               | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| res354ee1745cea9544612b9fa47fc24112 | STOPPED |      |                                               | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| resb2461b8e0881b7d733f960006bbcfa72 | STOPPED |      |                                               | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike1                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:feca:ec94 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike2                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fed2:2769 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike3                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fed8:dd9f (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike4                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe6f:dfd (eth0)  | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike5                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe55:7d1 (eth0)  | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike6                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fed4:14a2 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike7                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe25:ba24 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike8                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe79:b31b (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike9                          | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe21:11b6 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike10                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe72:c023 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike11                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe37:59e6 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike12                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe18:222c (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike13                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe91:e8d5 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike14                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe94:53c4 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike15                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe8b:5aa2 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike16                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe4a:94a9 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike17                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe77:50ad (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike18                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe12:317b (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike19                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe0e:efb8 (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+
| tommylike20                         | RUNNING |      | fd42:2c15:68be:30b4:216:3eff:fe48:798b (eth0) | CONTAINER | 0         |
+-------------------------------------+---------+------+-----------------------------------------------+-----------+-----------+

Steps to reproduce

  1. Launch 20 Ubuntu containers concurrently.
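The reproduction step could be scripted roughly as follows. This is only a sketch, not the script used in the report: the `demo` name prefix, the helper names, and the choice of `ubuntu:20.04` as the image are assumptions, and it needs a host where the `lxc` client is installed and configured.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def instance_names(prefix, count):
    """Build names such as demo1 .. demoN (the prefix is an arbitrary choice)."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


def launch_command(image, name):
    """Command line that creates and starts one container."""
    return ["lxc", "launch", image, name]


def launch_all(image, names, runner=subprocess.run):
    """Issue all launches at roughly the same time and return their exit codes.

    `runner` is injectable so the logic can be exercised without LXD.
    """
    def launch(name):
        result = runner(launch_command(image, name), capture_output=True, text=True)
        return result.returncode

    # One worker per container so that every launch is in flight concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        return list(pool.map(launch, names))
```

Calling `launch_all("ubuntu:20.04", instance_names("demo", 20))` would then start 20 containers in parallel; a non-zero entry in the returned list means that launch failed.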

Information to attach

stgraber commented 2 years ago

That's unlikely to be a LXD issue.

IPv4 only gets assigned once the container has started up enough to run systemd-networkd and have it run the DHCP client. This always takes a little while and likely quite a bit longer on a busy system such as the one you're describing.

IPv6 is instead handled directly by the kernel so doesn't need anything from the container itself and will appear almost immediately.
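To see this gap on a live system, one could list the running instances that do not yet have a global IPv4 address. The sketch below assumes the JSON layout that `lxc list --format json` produces (a `state.network` map whose entries carry `addresses` with `family`, `scope` and `address` fields); the helper names are made up for illustration.

```python
import json
import subprocess


def global_addresses(instance, family):
    """Global-scope addresses of one family ("inet" or "inet6") for an instance."""
    state = instance.get("state") or {}      # may be null for stopped instances
    network = state.get("network") or {}
    found = []
    for nic_name, nic in network.items():
        if nic_name == "lo":
            continue  # skip loopback
        for addr in nic.get("addresses") or []:
            if addr.get("family") == family and addr.get("scope") == "global":
                found.append(addr["address"])
    return found


def waiting_for_ipv4(instances):
    """Names of running instances that have no global IPv4 address yet."""
    return [
        inst["name"]
        for inst in instances
        if str(inst.get("status", "")).lower() == "running"
        and not global_addresses(inst, "inet")
    ]


def list_instances():
    """Fetch the instance list through the lxc CLI (needs a working LXD setup)."""
    result = subprocess.run(
        ["lxc", "list", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Running `waiting_for_ipv4(list_instances())` repeatedly during a mass launch should show the list shrinking as each container's DHCP client completes.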

TommyLike commented 2 years ago

Apart from speeding up the container start up process, is there any other way to alleviate this? I mean it's much faster when fewer instances are launching. @stgraber

TommyLike commented 2 years ago

Console log for my instance:

systemd v243-54.oe1 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy)
Detected virtualization lxc.
Detected architecture x86-64.
Failed to create symlink /sys/fs/cgroup/cpu: File exists
Failed to create symlink /sys/fs/cgroup/net_cls: File exists

Welcome to openEuler 20.03 (LTS-SP3)!

Initializing machine ID from random generator.
Couldn't move remaining userspace processes, ignoring: Invalid argument
/usr/lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
dev-.lxd\x2dmounts.mount: unit configures an IP firewall, but the local system does not support BPF/cgroup firewalling.
(This warning is only shown for the first unit using IP firewalling.)
[  OK  ] Created slice system-getty.slice.
[  OK  ] Created slice User and Session Slice.
[  OK  ] Started Dispatch Password Requests to Console Directory Watch.
[  OK  ] Started Forward Password Requests to Wall Directory Watch.
[UNSUPP] Starting of Arbitrary Executable Fi…tem Automount Point not supported.
[  OK  ] Reached target Local Encrypted Volumes.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Remote File Systems.
[  OK  ] Reached target Slices.
[  OK  ] Reached target Swap.
[  OK  ] Listening on Process Core Dump Socket.
[  OK  ] Listening on initctl Compatibility Named Pipe.
[  OK  ] Listening on Journal Socket (/dev/log).
[  OK  ] Listening on Journal Socket.
[  OK  ] Listening on udev Control Socket.
[  OK  ] Listening on udev Kernel Socket.
         Mounting Temporary Directory (/tmp)...
         Starting Journal Service...
         Starting Remount Root and Kernel File Systems...
         Starting Apply Kernel Variables...
         Starting udev Coldplug all Devices...
[  OK  ] Started Journal Service.
[  OK  ] Mounted Temporary Directory (/tmp).
[  OK  ] Started Remount Root and Kernel File Systems.
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Started udev Coldplug all Devices.
         Starting Flush Journal to Persistent Storage...
         Starting Create System Users...
[  OK  ] Started Flush Journal to Persistent Storage.
[  OK  ] Started Create System Users.
         Starting Create Static Device Nodes in /dev...
[  OK  ] Started Create Static Device Nodes in /dev.
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
         Starting Restore /run/initramfs on shutdown...
         Starting Rebuild Dynamic Linker Cache...
         Starting Create Volatile Files and Directories...
         Starting udev Kernel Device Manager...
[  OK  ] Started udev Kernel Device Manager.
[  OK  ] Started Restore /run/initramfs on shutdown.
[  OK  ] Started Rebuild Dynamic Linker Cache.
[  OK  ] Started Create Volatile Files and Directories.
         Starting Security Auditing Service...
         Starting Rebuild Journal Catalog...
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.
[  OK  ] Started Rebuild Journal Catalog.
         Starting Update is Completed...
         Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started Update is Completed.
[  OK  ] Stopped Security Auditing Service.
         Starting Security Auditing Service...
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.
[  OK  ] Started Update UTMP about System Boot/Shutdown.
[  OK  ] Stopped Security Auditing Service.
         Starting Security Auditing Service...
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.
[  OK  ] Reached target System Initialization.
[  OK  ] Started dnf makecache --timer.
[  OK  ] Started Daily Cleanup of Temporary Directories.
[  OK  ] Reached target Timers.
[  OK  ] Listening on D-Bus System Message Bus Socket.
[  OK  ] Reached target Sockets.
[  OK  ] Reached target Basic System.
[  OK  ] Started D-Bus System Message Bus.
         Starting Update RTC With System Clock...
         Starting LSB: Bring up/down networking...
         Starting Login Service...
[  OK  ] Started Update RTC With System Clock.
[  OK  ] Stopped Security Auditing Service.
         Starting Security Auditing Service...
[  OK  ] Started Login Service.
         Starting Hostname Service...
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.
[  OK  ] Started LSB: Bring up/down networking.
[  OK  ] Started Hostname Service.
[  OK  ] Reached target Network.
[  OK  ] Reached target Network is Online.
         Starting System Logging Service...
         Starting Permit User Sessions...
[  OK  ] Started Permit User Sessions.
[  OK  ] Stopped Security Auditing Service.
         Starting Security Auditing Service...
[  OK  ] Started Console Getty.
[  OK  ] Reached target Login Prompts.
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.
[  OK  ] Started System Logging Service.
[  OK  ] Started Command Scheduler.
[  OK  ] Reached target Multi-User System.
[  OK  ] Reached target Graphical Interface.
         Starting Update UTMP about System Runlevel Changes...
[  OK  ] Stopped Security Auditing Service.
         Starting Security Auditing Service...
[  OK  ] Started Update UTMP about System Runlevel Changes.
[FAILED] Failed to start Security Auditing Service.
See 'systemctl status auditd.service' for details.

stgraber commented 2 years ago

You can make some tweaks to your image to have fewer things start on boot. If using Ubuntu, you may want to try images:ubuntu/20.04 instead of ubuntu:20.04, as the image from the images: remote (the community image) is lighter and doesn't run cloud-init on first boot, which may save you some time.
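One way to check whether a lighter image helps is to time how long an instance takes to obtain its IPv4 address. The following is a rough sketch under stated assumptions: the helper names, the timeout, and the polling interval are arbitrary, and it relies on `lxc list` printing the IPv4 column when asked for column `4` in CSV format.

```python
import subprocess
import time


def ipv4_of(name):
    """IPv4 column that `lxc list` reports for `name` ("" while none is assigned).

    Note: a plain name acts as a prefix filter, so names should be unambiguous.
    """
    result = subprocess.run(
        ["lxc", "list", name, "--columns", "4", "--format", "csv"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def seconds_until_ipv4(name, probe=ipv4_of, timeout=120.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until `probe(name)` returns an address; return elapsed seconds.

    Returns None if no address shows up within `timeout` seconds.
    `probe`, `clock` and `sleep` are injectable for testing without LXD.
    """
    start = clock()
    while True:
        if probe(name):
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

Launching one instance from `ubuntu:20.04` and one from `images:ubuntu/20.04`, then calling `seconds_until_ipv4` on each right after launch, gives a simple comparison. The numbers will vary with host load.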

But overall, I'd expect it to be I/O and CPU contention slowing things down on startup which isn't something that we can really do anything about on the LXD side of things.