MatrixAI / Emergence

Distributed Infrastructure Orchestration
Apache License 2.0

Runtime dynamic resource constraints #47

Open nzhang-zh opened 5 years ago

nzhang-zh commented 5 years ago

We will need the capability to update the resource constraints of automatons at runtime. Investigate how to achieve this with the Linux namespace API and control groups.

There are some existing demos using clone by @mokuki082 in the container practices repo.

nzhang-zh commented 5 years ago

Here is a simple namespace-switching demo in fork-and-unshare style: e2751a70d379da7d52fbff8362dd0d0edf4eae5a

nzhang-zh commented 5 years ago

Resource constraints can be set with nice values, POSIX rlimits, and cgroups. Among these, only cgroups allow per-service resource constraints.

The resource constraints of a cgroup can be updated at runtime by writing to the control files under /sys/fs/cgroup/, or via systemd's D-Bus API if applicable.

According to this, accessing cgroups via systemd seems like the better choice.
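As a rough illustration of the /sys/fs/cgroup/ route, the sketch below writes a new value into a cgroup control file and reads it back. It assumes the cgroup v2 unified hierarchy, where the memory limit lives in a file named memory.max; the helper name `set_cgroup_limit` and the cgroup directory used in the usage note are hypothetical, and writing to a real cgroup needs root or a delegated subtree.

```python
from pathlib import Path


def set_cgroup_limit(cgroup_dir, control_file, value):
    """Write a new limit to a cgroup control file and return the file's contents.

    cgroup_dir   -- directory of the target cgroup (assumed to exist already)
    control_file -- name of the control file, e.g. "memory.max" on cgroup v2
    value        -- the new limit, e.g. a byte count or the string "max"
    """
    path = Path(cgroup_dir) / control_file
    # The kernel applies the new limit as soon as the write succeeds;
    # processes in the cgroup do not need to be restarted.
    path.write_text(f"{value}\n")
    # Read the file back so the caller can see what was actually stored.
    return path.read_text().strip()
```

For example, `set_cgroup_limit("/sys/fs/cgroup/automaton-1", "memory.max", 256 * 1024 * 1024)` would cap a hypothetical `automaton-1` cgroup at 256 MiB while its processes keep running.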

Previously, the kernel's cgroups API was exposed directly as shared application API, following the rules of the Pax Control Groups document. However, the kernel cgroup interface has been reworked into an API that requires that each individual cgroup is managed by a single writer only. With this change the main cgroup tree becomes private property of that userspace component and is no longer a shared resource. On systemd systems PID 1 takes this role and hence needs to provide APIs for clients to take benefit of the control groups functionality of the kernel. Note that services running on systemd systems may manage their own subtrees of the cgroups tree, as long as they explicitly turn on delegation mode for them (see below).

A cgroup namespace isolates a process's view of the cgroup hierarchy. Combined with proper mount settings, it provides:

cgroup_namespaces(7): better confinement of containerized processes, because it is possible to mount the container's cgroup filesystems such that the container processes can't gain access to ancestor cgroup directories.

So a cgroup namespace is not for setting cgroup memberships.

setns(2): Using setns() to change the caller's cgroup namespace does not change the caller's cgroup memberships.

I had previously misunderstood this last point.
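Since membership is separate from the namespace, it can help to inspect it directly. The sketch below parses /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:cgroup-path (on cgroup v2 the entry looks like 0::/path). The function names are hypothetical. Note that the paths in this file are shown relative to the cgroup namespace root of the reading process, so the displayed path can differ between namespaces even though the membership itself is the same.

```python
def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup.

    Returns a list of (hierarchy_id, controllers, path) tuples, one per line.
    """
    entries = []
    for line in text.splitlines():
        if not line:
            continue
        # Split at most twice: the cgroup path itself may contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        # The controller list is empty for the cgroup v2 entry ("0::/path").
        controller_list = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), controller_list, path))
    return entries


def current_memberships(pid="self"):
    """Read and parse the cgroup memberships of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return parse_proc_cgroup(f.read())
```

Changing membership is done by writing a PID into the target cgroup's cgroup.procs file, not by switching namespaces.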

CMCDragonkai commented 5 years ago

Avoid systemd APIs, as they won't play nicely with OCI containers.

nzhang-zh commented 5 years ago

runc/libcontainer supports updating resource constraints, which means we might not need to manipulate cgroupfs directly ourselves.

Closing this for now, until we want to manage cgroups ourselves or runc's interface turns out not to be flexible enough for us.
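For reference, a minimal sketch of driving runc's update support from a script, assuming the `runc update` subcommand with its `--memory`, `--cpu-shares` and `--pids-limit` flags. The wrapper names and the container ID in the usage note are hypothetical, and this is only one possible way to call it, not how Emergence does it.

```python
import subprocess


def runc_update_argv(container_id, memory_bytes=None, cpu_shares=None, pids_limit=None):
    """Build the argument vector for `runc update`, including only the limits given."""
    argv = ["runc", "update"]
    if memory_bytes is not None:
        argv += ["--memory", str(memory_bytes)]
    if cpu_shares is not None:
        argv += ["--cpu-shares", str(cpu_shares)]
    if pids_limit is not None:
        argv += ["--pids-limit", str(pids_limit)]
    # The container ID goes last, after all options.
    argv.append(container_id)
    return argv


def runc_update(container_id, **limits):
    """Apply new limits to a running container; raises CalledProcessError on failure."""
    subprocess.run(runc_update_argv(container_id, **limits), check=True)
```

For example, `runc_update("automaton-1", memory_bytes=256 * 1024 * 1024, cpu_shares=512)` would ask runc to change the limits of a hypothetical running container named `automaton-1`.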

nzhang-zh commented 5 years ago

Reopening this to experiment with how modifying resource constraints at runtime affects a running container.