neondatabase / autoscaling

Postgres vertical autoscaling in k8s
Apache License 2.0

Merge auxiliary containers to reduce pressure on container runtime #747

Closed Omrigan closed 6 months ago

Omrigan commented 8 months ago

https://github.com/neondatabase/autoscaling/issues/711#issuecomment-1893637974 suggests there is no consistent pattern in how long each container takes to be created and started.

This can be explained by the fact that containerd has to create and run each container. During peaks it might be overwhelmed, so containers may sporadically queue up.

The idea is to reduce the number of containers. In the best case we need only one: the main container with QEMU.

sharnoff commented 8 months ago

A couple notes:

  1. IIRC there are three reasons we require init containers:
    1. To make image loading easier (see init-rootdisk, or [when using a custom kernel] init-kernel)
    2. To perform privileged operations without giving neonvm-runner those privileges (see neonvm-runner)
    3. For setting up iptables rules externally (cplane injects an extra init container to do this, because (a) it's not natively supported by neonvm, and (b) using an init container means the process is the same for k8s-pod computes)
  2. Given this, there are several reasons in practice we may need to keep at least one init container. But I'm generally +1 to reducing the number of init containers where we're able to.

Omrigan commented 7 months ago

As a result of discussions with @sharnoff and @ololobus, I intend to make the following changes:

  1. Merge init-rootdisk and sysctl containers (#769)
  2. Move iptables configuration logic from compute-init to neonvm-runner, thus dropping compute-init
  3. Get rid of the init container by:
    a. Merging compute and runner images, so there is no need to copy disk
    b. Configuring parent network namespace, so that sysctl is not needed

sharnoff commented 7 months ago

status: Two PRs are ready (#769 and #782), need review on those. Planning to open a follow-up to #782 to refactor neonvm-runner.

sharnoff commented 7 months ago

status: #769 and #782 still pending, reviewed & need to reply. #790 opened and should help as well, WIP. Previously blocked on deploy of startup metrics; they aren't yet on prod, but should be this week.

Omrigan commented 7 months ago

#769 has been in production since yesterday, and the graph suggests a 1s reduction in timings (at least for the fastest ones):

[image: screenshot-2024-02-23_02-19-30_504387060]

#782 is merged, but enabling it on the cplane side is WIP: https://github.com/neondatabase/cloud/pull/10630

#790 is still WIP.