jupyterhub / systemdspawner

Spawn JupyterHub single-user notebook servers with systemd
BSD 3-Clause "New" or "Revised" License

Add per user swap limit #39

Open betatim opened 5 years ago

betatim commented 5 years ago

Currently, when a user exceeds the memory limit set by systemdspawner, their processes are not killed; instead, they start using swap. This is quite confusing for users: their jobs keep working but slow down a lot for no apparent reason.

One example is #15 and links in it.

A comment there also points to an option that lets you limit how much swap a process can use.

The problem with this is that the current method for setting memory limits (MemoryLimit) is disabled if you set a swap limit. This means we would need to build something that detects that a user specified a swap limit and then switches over to the new names. Or maybe we migrate to the new names only. Deciding this needs input from someone with more systemd experience, as the differences between the v1 and v2 cgroup limits, and how widely each is deployed, aren't something I know much about.
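To make the "detect a swap limit and switch names" idea concrete, here is a minimal sketch. The function `memory_properties` is hypothetical and is not part of systemdspawner. It assumes that `MemoryLimit` is the cgroup v1 directive and that `MemoryMax` and `MemorySwapMax` are the cgroup v2 directives, as described in systemd's resource-control documentation.

```python
def memory_properties(mem_limit=None, swap_limit=None):
    """Pick systemd memory directives for the given limits (in bytes).

    Hypothetical helper, not systemdspawner's actual behaviour:
    - no swap limit: keep the cgroup v1 name (MemoryLimit)
    - swap limit given: use the cgroup v2 names (MemoryMax, MemorySwapMax)
    """
    props = {}
    if swap_limit is None:
        # Old behaviour: only the v1 directive is emitted.
        if mem_limit is not None:
            props["MemoryLimit"] = str(mem_limit)
        return props

    # A swap limit was requested, so switch every name to the v2 variant.
    if mem_limit is not None:
        props["MemoryMax"] = str(mem_limit)
    props["MemorySwapMax"] = str(swap_limit)
    return props
```

With this approach, existing configurations that set only a memory limit would keep producing `MemoryLimit`, and only users who opt into a swap limit would need a host running cgroups v2.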

mxjeff commented 5 years ago

Well, I managed to limit (or disable) swapping, at least on Debian 10 (Buster).
I had to boot with unified cgroups and enable swap accounting, using the following kernel options:

systemd.unified_cgroup_hierarchy=1
swapaccount=1

On Debian, configure GRUB:

# in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet systemd.unified_cgroup_hierarchy=1 swapaccount=1"

Run /usr/sbin/update-grub2, then reboot.
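After the reboot, it can be worth confirming that the options took effect. The sketch below is an assumption-laden check, not something from the thread: `check_boot_setup` is a hypothetical helper. It relies on the kernel command line being readable at `/proc/cmdline` and on the file `cgroup.controllers` existing at the root of `/sys/fs/cgroup` when the unified (v2) hierarchy is mounted there.

```python
from pathlib import Path


def check_boot_setup(cmdline_path="/proc/cmdline", cgroup_root="/sys/fs/cgroup"):
    """Report whether the kernel options were passed and cgroups v2 is active.

    Hypothetical helper; the default paths are assumptions about a typical
    Linux host and can be overridden.
    """
    # The kernel command line is a single space-separated line of options.
    options = Path(cmdline_path).read_text().split()
    return {
        "unified_requested": "systemd.unified_cgroup_hierarchy=1" in options,
        "swapaccount_requested": "swapaccount=1" in options,
        # cgroup.controllers only exists at the root of a cgroup v2 mount.
        "unified_active": (Path(cgroup_root) / "cgroup.controllers").is_file(),
    }
```

Calling `check_boot_setup()` with no arguments on the host returns a dict of three booleans; all three should be true before relying on MemorySwapMax.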

Then I wrote a dedicated slice to spawn jupyter-singleuser in. This lets me move the resource limits (mem_limit and cpu_limit) into the slice definition. As a bonus, the slice exposes all systemd resource-control directives, not only memory and CPU limits.

Here is a slice example:

# /etc/systemd/system/jupyter-singleuser.slice
[Unit]
Description=Jupyter SingleUser Slice
Before=slices.target

[Slice]
MemoryAccounting=true
MemoryHigh=1G
MemorySwapMax=200M

Then configure the spawner to use it: c.SystemdSpawner.slice=jupyter-singleuser.slice.
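For reference, the corresponding lines in the hub configuration might look like the sketch below. The file name `jupyterhub_config.py` and the `spawner_class` line are assumptions about a typical systemdspawner setup; only the `slice` setting comes from the comment above.

```python
# jupyterhub_config.py (assumed file name; loaded by JupyterHub, not run directly)
c = get_config()  # provided by JupyterHub when it loads this file

# Assumption: systemdspawner is installed and selected as the spawner.
c.JupyterHub.spawner_class = "systemdspawner.SystemdSpawner"

# Launch single-user servers inside the slice defined above; the memory
# and swap limits then come from the slice file rather than from
# mem_limit / cpu_limit in this config.
c.SystemdSpawner.slice = "jupyter-singleuser.slice"
```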

Even though this has been working nicely so far, I'm not yet sure what side effects running without cgroups v1 compatibility might have.