pmores commented 5 months ago

Description of problem

I run a single kata container with a 5G memory limit (ctr run --memory-limit 5368709120) while having

[hypervisor.qemu]
default_memory = 2048

in my configuration.toml.

Expected result

I expected to get a qemu VM with 2G of coldplugged memory and a call to Qemu::resize_memory() to handle the memory limit.

Actual result

I got a qemu VM with 12G of coldplugged memory and a resize_memory() call to resize memory to 12288 (12G).

Further information

The coldplugged memory size seems to come from InitialSizeManager implementation which, at least in a single container case, adds the memory limit size to default_memory and stores the result back in the config. This also happens twice since runtime-rs appears to be processing a Create request twice(*).

The double call along with the non-idempotency of InitialSizeManager seems to explain the 12G of coldplugged memory (qemu sees 2G + 5G + 5G = 12G in default_memory in hypervisor config).

I have no idea about the resize_memory() to resize memory to the amount it already has.

(*) Relevant part of the callstack (same in both calls):

 { fn: "resource::cpu_mem::initial_size::InitialSizeManager::setup_config", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/resource/src/cpu_mem/initial_size.rs", line: 102 },
 { fn: "runtimes::manager::RuntimeHandlerManagerInner::try_init::{{closure}}::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 184 },
 { fn: "runtimes::manager::RuntimeHandlerManagerInner::try_init::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 144 },
 { fn: "runtimes::manager::RuntimeHandlerManager::try_init_runtime_instance::{{closure}}::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 353 },
 { fn: "runtimes::manager::RuntimeHandlerManager::try_init_runtime_instance::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 345 },
 { fn: "runtimes::manager::RuntimeHandlerManager::handler_message::{{closure}}::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 376 },
 { fn: "runtimes::manager::RuntimeHandlerManager::handler_message::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/runtimes/src/manager.rs", line: 356 },
 { fn: "service::task_service::TaskService::handler_message::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/service/src/task_service.rs", line: 45 },
 { fn: "<service::task_service::TaskService as containerd_shim_protos::shim::shim_ttrpc_async::Task>::create::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/crates/service/src/task_service.rs", line: 65 },
 { fn: "<core::pin::Pin<P> as core::future::future::Future>::poll", file: "/rustc/5680fa18feaa87f3ff04063800aec256c3d4b4be/library/core/src/future/future.rs", line: 125 },
 { fn: "<containerd_shim_protos::shim::shim_ttrpc_async::CreateMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}", file: "/home/pvl/src/work/kata-pmores/kata-containers/src/runtime-rs/target/debug/build/containerd-shim-protos-f9b1bc737a552e2a/out/shim_async/shim_ttrpc.rs", line: 128 },

pmores commented 5 months ago

@kata-containers/architecture-committee Could you guys help me tag folks who have knowledge of these parts of runtime-rs and might be able to offer insights?

pmores commented 5 months ago

I verified that when I try the same (analogous) thing with a pod (just using the io.kubernetes.cri.sandbox-memory annotation now) I get the expected result - the VM starts with 2G of coldplugged memory, followed by a resize_memory() call with 7168, or 7G (2G + 5G), as the argument.

lifupan commented 5 months ago

Hi @pmores

I'll take it.

pmores commented 5 months ago

Thanks a lot @lifupan , please tag me if you submit a PR!

pmores commented 5 months ago

I verified that when I try the same (analogous) thing with a pod (just using the io.kubernetes.cri.sandbox-memory annotation now) I get the expected result - the VM starts with 2G of coldplugged memory, followed by a resize_memory() call with 7168, or 7G (2G + 5G), as the argument.

Hm, as an update after observing this some more, it's however possible even in the pod scenario to end up with 12G (default_memory + 2 * sandbox-memory) of coldplugged memory. Then the Hypervisor get a resize_memory() to 7G.

I'm increasingly confused. :-)

lifupan commented 4 months ago

Hi @pmores

Can you double check whether this PR. https://github.com/kata-containers/kata-containers/pull/9836 fixed your issue?

kata-containers / kata-containers

runtime-rs: running a container with ctr and --memory-limit #9778

Description of problem

Expected result

Actual result

Further information