hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

cpuset: `no space left on device` #23405

Open rodrigol-chan opened 4 months ago

rodrigol-chan commented 4 months ago

Nomad version

Nomad v1.7.7
BuildDate 2024-04-16T19:26:43Z
Revision 0f34c85ee63f6472bd2db1e2487611f4b176c70c

Operating system and Environment details

Running Ubuntu 22.04 on Google Cloud in an n2d-standard-32 instance.

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Issue

Alerts fired due to failed allocations. Upon investigation, I noticed the following log line:

{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:58:45.224880+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}

Also interesting to observe is that, unlike in our other 1.7.x clients, there's overlap between the CPUs for the reserve and share slices:

$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus
==> /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus <==
0-3

==> /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus <==
0-31
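
The overlap above is easy to check mechanically. This short script (a sketch, not part of Nomad; `parse_cpuset` is a hypothetical helper name) parses the kernel's cpuset list format and intersects the two slices:

```python
def parse_cpuset(spec: str) -> set[int]:
    """Parse a kernel cpuset list string like "0-3,7" into a set of CPU ids."""
    cpus: set[int] = set()
    for part in spec.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

# Values read from the two cpuset.cpus files shown above.
reserve = parse_cpuset("0-3")
share = parse_cpuset("0-31")
print(sorted(reserve & share))  # → [0, 1, 2, 3] — CPUs claimed by both slices
```

On a healthy client the intersection should be empty, as in the `0-3` / `4-31` split seen later in this thread.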

Reproduction steps

Not clear how to reproduce. This happened on a single instance. All allocations that failed are from periodic jobs running on the exec driver with no core constraints.

Expected Result

Allocations spawn successfully.

Actual Result

Allocations failed to spawn.

Nomad Client logs (if appropriate)

{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:56:52.170066+02:00","alloc_id":"7a39078a-0769-d3e9-38a0-c706ff516de8","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:58:45.224880+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:58:45.224934+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:58:45.226936+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:58:45.228745+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"processing","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:58:45.241815+02:00","alloc_id":"84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:59:45.699245+02:00","alloc_id":"c506511a-eb17-73ee-7164-aaa9df390f47","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:59:45.699331+02:00","alloc_id":"c506511a-eb17-73ee-7164-aaa9df390f47","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"processing","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:59:45.701453+02:00","alloc_id":"c506511a-eb17-73ee-7164-aaa9df390f47","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T09:59:45.703532+02:00","alloc_id":"c506511a-eb17-73ee-7164-aaa9df390f47","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T09:59:45.717471+02:00","alloc_id":"c506511a-eb17-73ee-7164-aaa9df390f47","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:00.588677+02:00","alloc_id":"69725229-e961-9496-917c-d81b21aac9d8","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.588716+02:00","alloc_id":"69725229-e961-9496-917c-d81b21aac9d8","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"maintenance-timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.594967+02:00","alloc_id":"69725229-e961-9496-917c-d81b21aac9d8","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.599704+02:00","alloc_id":"69725229-e961-9496-917c-d81b21aac9d8","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:00.617996+02:00","alloc_id":"69725229-e961-9496-917c-d81b21aac9d8","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:00.944803+02:00","alloc_id":"ac468e4e-a551-2991-a4fc-b8fa64599552","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.944850+02:00","alloc_id":"ac468e4e-a551-2991-a4fc-b8fa64599552","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.950686+02:00","alloc_id":"ac468e4e-a551-2991-a4fc-b8fa64599552","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:00.954145+02:00","alloc_id":"ac468e4e-a551-2991-a4fc-b8fa64599552","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:00.971800+02:00","alloc_id":"ac468e4e-a551-2991-a4fc-b8fa64599552","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:01.280014+02:00","alloc_id":"4b3df1d6-87a0-b0f6-38c6-b4e305d1f3da","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.280062+02:00","alloc_id":"4b3df1d6-87a0-b0f6-38c6-b4e305d1f3da","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"maintenance-timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.284932+02:00","alloc_id":"4b3df1d6-87a0-b0f6-38c6-b4e305d1f3da","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.290230+02:00","alloc_id":"4b3df1d6-87a0-b0f6-38c6-b4e305d1f3da","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:01.305828+02:00","alloc_id":"4b3df1d6-87a0-b0f6-38c6-b4e305d1f3da","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:01.604525+02:00","alloc_id":"7fa1997a-9aa0-50cf-bb01-85168b5e7ba0","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.604609+02:00","alloc_id":"7fa1997a-9aa0-50cf-bb01-85168b5e7ba0","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.609268+02:00","alloc_id":"7fa1997a-9aa0-50cf-bb01-85168b5e7ba0","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:00:01.615208+02:00","alloc_id":"7fa1997a-9aa0-50cf-bb01-85168b5e7ba0","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:00:01.625727+02:00","alloc_id":"7fa1997a-9aa0-50cf-bb01-85168b5e7ba0","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:01:00.615323+02:00","alloc_id":"727d4eeb-b135-3a31-1765-133146f2cd7f","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:01:00.615380+02:00","alloc_id":"727d4eeb-b135-3a31-1765-133146f2cd7f","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:01:00.617829+02:00","alloc_id":"727d4eeb-b135-3a31-1765-133146f2cd7f","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:01:00.619727+02:00","alloc_id":"727d4eeb-b135-3a31-1765-133146f2cd7f","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:01:00.633596+02:00","alloc_id":"727d4eeb-b135-3a31-1765-133146f2cd7f","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:02:00.614056+02:00","alloc_id":"e1935c49-11a0-579c-5ab5-244bc931f0ad","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:02:00.614102+02:00","alloc_id":"e1935c49-11a0-579c-5ab5-244bc931f0ad","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:02:00.615991+02:00","alloc_id":"e1935c49-11a0-579c-5ab5-244bc931f0ad","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-06-21T10:02:00.617758+02:00","alloc_id":"e1935c49-11a0-579c-5ab5-244bc931f0ad","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-06-21T10:02:00.629413+02:00","alloc_id":"e1935c49-11a0-579c-5ab5-244bc931f0ad","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}

Nomad client configuration

client {
  gc_max_allocs = 300
  gc_disk_usage_threshold = 80
}
tgross commented 4 months ago

Hi @rodrigol-chan! Sorry to hear you're running into trouble. The error you're getting here is particularly weird:

write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device

We're writing to the /sys/fs/cgroup mount, which is a virtual file system! The only way I can think of for this to happen is if we've written a ton of inodes to the cgroup and haven't been cleaning them up correctly. What do you get if you cat the /proc/cgroups virtual file?

$ cat /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled
cpuset  0       230     1
cpu     0       230     1
cpuacct 0       230     1
blkio   0       230     1
memory  0       230     1
devices 0       230     1
freezer 0       230     1
net_cls 0       230     1
perf_event      0       230     1
net_prio        0       230     1
hugetlb 0       230     1
pids    0       230     1
rdma    0       230     1
misc    0       230     1
rodrigol-chan commented 4 months ago

It happened again just now, on a different machine.

$ cat /proc/cgroups
#subsys_name    hierarchy       num_cgroups     enabled
cpuset  0       3953    1
cpu     0       3953    1
cpuacct 0       3953    1
blkio   0       3953    1
memory  0       3953    1
devices 0       3953    1
freezer 0       3953    1
net_cls 0       3953    1
perf_event      0       3953    1
net_prio        0       3953    1
hugetlb 0       3953    1
pids    0       3953    1
rdma    0       3953    1
misc    0       3953    1

This Nomad client configuration now looks relevant:

client {
  gc_max_allocs = 300
  gc_disk_usage_threshold = 80
}

And the client currently has over 300 allocations on disk:

$ sudo ls -1 /var/lib/nomad/alloc | wc -l
374

Nomad seems to be keeping a lot of tmpfs mounts around even when the allocations aren't running anymore. I'm not sure if that's by design.

$ df -t tmpfs | wc -l
407
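
Those tmpfs mounts can be attributed to allocation directories by filtering `/proc/mounts` (a sketch; the `/var/lib/nomad/alloc` prefix matches the path used above, and `count_alloc_tmpfs` is a hypothetical helper):

```python
def count_alloc_tmpfs(mounts_text: str,
                      alloc_dir: str = "/var/lib/nomad/alloc") -> int:
    """Count tmpfs entries in /proc/mounts-formatted text under alloc_dir.

    /proc/mounts lines are: device mountpoint fstype options dump pass.
    """
    count = 0
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "tmpfs" and fields[1].startswith(alloc_dir):
            count += 1
    return count

sample = (
    "tmpfs /var/lib/nomad/alloc/84407c34-35e1-b0cb-4a8f-f3b6c9a8cc81/secrets"
    " tmpfs rw,size=1024k 0 0\n"
    "proc /proc proc rw 0 0\n"
)
print(count_alloc_tmpfs(sample))  # → 1
```

On a live client you would pass in the contents of `/proc/mounts` instead of the sample text.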
rodrigol-chan commented 4 months ago

For extra context: the issue seems new with the 1.7.x upgrade. We ran this configuration on 1.6.x for about 8 months with no similar issues.

tgross commented 4 months ago

Thanks for that extra info @rodrigol-chan. Even with that large number of allocs, I'd think you'd be ok until you get to 65535 inodes. I'll dig into that a little further to see if there's some more /sys or /proc filesystem spelunking we can do here.

Nomad seems to be keeping a lot of tmpfss around even if the allocations aren't running anymore. I'm not sure if that's by design.

The mounts are left in place until the allocation is GC'd on the client. We do that so that you can debug failed allocations.

rodrigol-chan commented 1 month ago

The issue still happens as of 1.8.3. Is there anything we can do to help troubleshoot this?

tgross commented 1 month ago

Hi @rodrigol-chan, sorry, I haven't been able to circle back to this and I'm currently swamped trying to land some work for our 1.9 beta next week.

I suspect this is platform-specific. I think you'll want to look into whether there's anything in the host configuration that could be limiting the size of those virtual FS directories.

tgross commented 1 month ago

Hi @rodrigol-chan! Just wanted to check in so you don't think I've forgotten this issue. I re-read through your initial report to see if there were any clues I missed.

Also interesting to observe is that, unlike in our other 1.7.x clients, there's overlap between the CPUs for the reserve and share slices:

Even ignoring the errors you're seeing, that's got to be a bug all by itself. These should never overlap. Even though we can't write to the two files atomically, we always remove from the source first and then write to the destination. So in that tiny race window you should see a missing CPU, but never one counted twice. I'll look into whether there's another race condition somewhere that isn't handled correctly.
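
The remove-then-add ordering described above can be sketched in miniature (a simulation only; in-place set operations stand in for the two `cpuset.cpus` file writes, which the kernel does not apply atomically):

```python
def move_cpus(source: set[int], dest: set[int], cpus: set[int]) -> None:
    """Move cpus from source to dest, shrinking the source first.

    A failure between the two steps leaves `cpus` missing from both
    partitions, but the partitions never overlap under this ordering.
    """
    source -= cpus  # step 1: remove from the source cpuset
    dest |= cpus    # step 2: add to the destination cpuset

share, reserve = set(range(32)), set()
move_cpus(share, reserve, {0, 1, 2, 3})
print(sorted(reserve), reserve & share)  # → [0, 1, 2, 3] set()
```

Under this ordering the observed state (the same CPUs present in both `reserve.slice` and `share.slice`) should be unreachable, which is why it points at a separate race.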

All allocations that failed are from periodic jobs running on the exec driver with no core constraints.

You have other allocations on the same host that do use core constraints though? If not, we're writing an empty value to the cgroup. In which case, I found this Stack Exchange post which describes that scenario, but has no answer. :facepalm:

I managed to dig up a few old issues that suggest that if cpuset.mems doesn't exist in the cgroup directory, then you can't write to cpuset.cpus either, but I also can't create a scenario where it wouldn't exist. Just creating a new directory with something like mkdir /sys/fs/cgroup/nomad.slice/new.slice makes it show up for me, and you can't remove it.

Also, I wanted to see if I could get this error outside of Nomad by echoing a bad input to the cgroup file, and wasn't able to get that same error.

| input | result | error |
|-------|--------|-------|
| `" "` | unset  | -     |
| `""`  | unset  | -     |
| `-1`  | -      | write error: Invalid argument |
| `2`   | -      | write error: Numerical result out of range |
| `2-1` | -      | write error: Invalid argument |
| `a`   | -      | write error: Invalid argument |
| `,0`  | 0      | -     |
| `1,`  | 1      | -     |
| `0-0` | 0      | -     |
| `0-a` | -      | write error: Invalid argument |

I did get some interesting (but different) errors trying to write to the nomad.slice/cpuset.cpus

# cat /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus
1
# cat /sys/fs/cgroup/nomad.slice/cpuset.cpus
0
# cat /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus
1
# echo 0 > /sys/fs/cgroup/nomad.slice/cpuset.cpus
bash: echo: write error: Device or resource busy

One more thing I'd like you to try is the following, to make sure we've counted the cgroups correctly when trying to figure out if it's the inode issue:

# find /sys/fs/cgroup/ | wc -l
2965
# find /sys/fs/cgroup/nomad.slice | wc -l
147
rodrigol-chan commented 1 month ago

You have other allocations on the same host that do use core constraints though?

That's correct.

One more thing I'd like you to try is the following, to make sure we've counted the cgroups correctly when trying to figure out if it's the inode issue.

Just happened again:

# find /sys/fs/cgroup -depth -type d | wc -l
81
# find /sys/fs/cgroup/nomad.slice -depth -type d | wc -l
29
# head /sys/fs/cgroup/nomad.slice/cpuset.cpus /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus
==> /sys/fs/cgroup/nomad.slice/cpuset.cpus <==
0-31

==> /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus <==
0-3

==> /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus <==
4-31

Log output:

{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:26:17.441673+02:00","alloc_id":"ff98bb16-7a4e-2b4d-f8d6-584d767dd1bf","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:26:18.425610+02:00","alloc_id":"682bf2a1-d3d9-417e-1325-c4e2ffc56185","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:27:00.871024+02:00","alloc_id":"d6108b42-84de-5da3-0fd7-f902107d6069","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:27:00.871102+02:00","alloc_id":"d6108b42-84de-5da3-0fd7-f902107d6069","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:27:00.873250+02:00","alloc_id":"d6108b42-84de-5da3-0fd7-f902107d6069","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:27:00.875331+02:00","alloc_id":"d6108b42-84de-5da3-0fd7-f902107d6069","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:27:00.891020+02:00","alloc_id":"d6108b42-84de-5da3-0fd7-f902107d6069","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:28:00.461064+02:00","alloc_id":"c4f3dcf4-dfbd-6a10-8a4a-59e7dd2c7cf2","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:00.461112+02:00","alloc_id":"c4f3dcf4-dfbd-6a10-8a4a-59e7dd2c7cf2","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:00.463403+02:00","alloc_id":"c4f3dcf4-dfbd-6a10-8a4a-59e7dd2c7cf2","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:00.465462+02:00","alloc_id":"c4f3dcf4-dfbd-6a10-8a4a-59e7dd2c7cf2","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:28:04.872027+02:00","alloc_id":"c4f3dcf4-dfbd-6a10-8a4a-59e7dd2c7cf2","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:28:51.538329+02:00","alloc_id":"8c0fbbff-6823-c309-e64f-5a107fad5f9b","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:51.538372+02:00","alloc_id":"8c0fbbff-6823-c309-e64f-5a107fad5f9b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"hulppiet-processing","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:51.540765+02:00","alloc_id":"8c0fbbff-6823-c309-e64f-5a107fad5f9b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:28:51.543117+02:00","alloc_id":"8c0fbbff-6823-c309-e64f-5a107fad5f9b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:28:51.559830+02:00","alloc_id":"8c0fbbff-6823-c309-e64f-5a107fad5f9b","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:29:33.656064+02:00","alloc_id":"aafaa858-254c-b7c1-608d-b475eac076df","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.186942+02:00","alloc_id":"fcf42431-4866-956f-0fd6-cfb3bb0bc6f7","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.186998+02:00","alloc_id":"fcf42431-4866-956f-0fd6-cfb3bb0bc6f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.189414+02:00","alloc_id":"fcf42431-4866-956f-0fd6-cfb3bb0bc6f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.191631+02:00","alloc_id":"fcf42431-4866-956f-0fd6-cfb3bb0bc6f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.616165+02:00","alloc_id":"4104ffa6-61d6-bd2b-c4c6-16dc67fc6101","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.616259+02:00","alloc_id":"4104ffa6-61d6-bd2b-c4c6-16dc67fc6101","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"sharkmachine-db-maintenance","type":"Setup Failure"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.618142+02:00","alloc_id":"e5fea182-c573-1165-beef-1bb28b54457b","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.618185+02:00","alloc_id":"e5fea182-c573-1165-beef-1bb28b54457b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.622550+02:00","alloc_id":"4104ffa6-61d6-bd2b-c4c6-16dc67fc6101","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.624860+02:00","alloc_id":"e5fea182-c573-1165-beef-1bb28b54457b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"requestmachine-timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.630967+02:00","alloc_id":"4104ffa6-61d6-bd2b-c4c6-16dc67fc6101","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.633081+02:00","alloc_id":"e5fea182-c573-1165-beef-1bb28b54457b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.666757+02:00","alloc_id":"e5fea182-c573-1165-beef-1bb28b54457b","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.668936+02:00","alloc_id":"4104ffa6-61d6-bd2b-c4c6-16dc67fc6101","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:00.978237+02:00","alloc_id":"b4282dcd-db53-0b92-7460-4948414fdc46","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.978283+02:00","alloc_id":"b4282dcd-db53-0b92-7460-4948414fdc46","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.982879+02:00","alloc_id":"b4282dcd-db53-0b92-7460-4948414fdc46","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:00.989391+02:00","alloc_id":"b4282dcd-db53-0b92-7460-4948414fdc46","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:01.338338+02:00","alloc_id":"fb073ab4-0928-dd31-af16-9d09356df977","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.338406+02:00","alloc_id":"fb073ab4-0928-dd31-af16-9d09356df977","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"imaginator-maintenance-timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.342794+02:00","alloc_id":"fb073ab4-0928-dd31-af16-9d09356df977","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.349456+02:00","alloc_id":"fb073ab4-0928-dd31-af16-9d09356df977","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:01.373888+02:00","alloc_id":"fb073ab4-0928-dd31-af16-9d09356df977","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:01.698021+02:00","alloc_id":"29a30585-27f8-6fd0-710c-4ff80df5f7f7","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.698061+02:00","alloc_id":"29a30585-27f8-6fd0-710c-4ff80df5f7f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.702273+02:00","alloc_id":"29a30585-27f8-6fd0-710c-4ff80df5f7f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:30:01.707199+02:00","alloc_id":"29a30585-27f8-6fd0-710c-4ff80df5f7f7","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:01.722370+02:00","alloc_id":"29a30585-27f8-6fd0-710c-4ff80df5f7f7","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:04.591155+02:00","alloc_id":"fcf42431-4866-956f-0fd6-cfb3bb0bc6f7","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:30:05.415728+02:00","alloc_id":"b4282dcd-db53-0b92-7460-4948414fdc46","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:31:00.682071+02:00","alloc_id":"1acbe5ca-f674-40f5-2eff-9c19dbd388ed","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:31:00.682114+02:00","alloc_id":"1acbe5ca-f674-40f5-2eff-9c19dbd388ed","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:31:00.684033+02:00","alloc_id":"1acbe5ca-f674-40f5-2eff-9c19dbd388ed","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:31:00.686174+02:00","alloc_id":"1acbe5ca-f674-40f5-2eff-9c19dbd388ed","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:31:00.703343+02:00","alloc_id":"1acbe5ca-f674-40f5-2eff-9c19dbd388ed","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"error","@message":"prerun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:32:00.641821+02:00","alloc_id":"9bfa85eb-c32f-d688-33ed-eb2705f66a5b","error":"pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:32:00.641890+02:00","alloc_id":"9bfa85eb-c32f-d688-33ed-eb2705f66a5b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"timer","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:32:00.644153+02:00","alloc_id":"9bfa85eb-c32f-d688-33ed-eb2705f66a5b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"nix-setup-profiles","type":"Setup Failure"}
{"@level":"info","@message":"Task event","@module":"client.alloc_runner.task_runner","@timestamp":"2024-09-30T16:32:00.646231+02:00","alloc_id":"9bfa85eb-c32f-d688-33ed-eb2705f66a5b","failed":true,"msg":"failed to setup alloc: pre-run hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device","task":"promtail","type":"Setup Failure"}
{"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-09-30T16:32:00.663011+02:00","alloc_id":"9bfa85eb-c32f-d688-33ed-eb2705f66a5b","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"}

It doesn't look like the CPUs overlapped this time. The number of dying descendants is curious, though; I wonder if it's related:

# head /sys/fs/cgroup/nomad.slice/cgroup.stat /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.stat /sys/fs/cgroup/nomad.slice/share.slice/cgroup.stat 
==> /sys/fs/cgroup/nomad.slice/cgroup.stat <==
nr_descendants 28
nr_dying_descendants 2356

==> /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.stat <==
nr_descendants 1
nr_dying_descendants 78

==> /sys/fs/cgroup/nomad.slice/share.slice/cgroup.stat <==
nr_descendants 25
nr_dying_descendants 2278
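For reference, the per-slice counts above come from cgroup.stat; a tiny awk helper can pull out just the nr_dying_descendants value for monitoring (a sketch; dying_count is a hypothetical name, and the example path assumes cgroup v2 mounted at /sys/fs/cgroup as in this issue):

```shell
# Hypothetical helper: extract the nr_dying_descendants value from
# cgroup.stat content read on stdin.
dying_count() {
  awk '$1 == "nr_dying_descendants" { print $2 }'
}

# On a live system (path taken from this issue):
#   dying_count < /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.stat
```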
tgross commented 1 month ago

Can you confirm whether the cpuset.mem file exists in the reserve.slice? And what's cgroup.max.descendants set to? Example:

$ cat /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.max.descendants
max
rodrigol-chan commented 1 month ago

And what's cgroup.max.descendants set to?

I did look at that at failure time, and from memory it was set to max.

Can you confirm whether the cpuset.mem file exists in the reserve.slice?

I can't find any cpuset.mem anywhere; did you mean cpuset.mems? The latter is present and appears to be empty for all allocations, as far as I can see. I somehow missed this earlier, but this is indeed a machine with NUMA. We make wide use of memory blocks, but none with a numa configuration.

$ lsmem -o +NODE
RANGE                                 SIZE  STATE REMOVABLE  BLOCK NODE
0x0000000000000000-0x00000000bfffffff   3G online       yes    0-2    0
0x0000000100000000-0x000000103fffffff  61G online       yes   4-64    0
0x0000001040000000-0x000000203fffffff  64G online       yes 65-128    1

Memory block size:         1G
Total online memory:     128G
Total offline memory:      0B

I'll double-check cgroup.max.descendants and cpuset.mem{s,} as soon as it happens again and update this issue. Thanks again for looking into this!

rodrigol-chan commented 1 month ago

Just happened again. (It has been happening strangely often lately.) Here are the requested values. cpuset.mems always seems to be present but empty in all cgroups managed by Nomad.

$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.max.descendants
max
$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.stat
nr_descendants 1
nr_dying_descendants 11
$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus
0-3
$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.mems
$
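For what it's worth, in cgroup v2 an empty cpuset.mems means the cgroup inherits its parent's node set, and the kernel exposes the resolved value in cpuset.mems.effective. A small sketch of that fallback (effective_mems is a hypothetical name; it takes a cgroup directory as its argument):

```shell
# Print a cgroup's cpuset.mems, falling back to the kernel-computed
# cpuset.mems.effective when the file is empty (i.e. inherited).
effective_mems() {
  mems="$(cat "$1/cpuset.mems")"
  if [ -n "$mems" ]; then
    echo "$mems"
  else
    cat "$1/cpuset.mems.effective"
  fi
}

# e.g. effective_mems /sys/fs/cgroup/nomad.slice/reserve.slice
```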

This doesn't look like it should be possible, though:

$ head /sys/fs/cgroup/nomad.slice/cpuset.cpus
0-31
$ head /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus
4-31
$ head /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus
0-3
$ head /sys/fs/cgroup/nomad.slice/reserve.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.realtime-gunicorn.scope/cpuset.cpus
4-7

It might just be an artifact of how the data is collected, since I don't think it's possible to take an atomic snapshot of cgroup state.
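One way to make the overlap check reproducible is to expand the kernel's cpuset list syntax ("0-3,8" style) and intersect the two sets; empty output means no shared CPUs. A POSIX-sh sketch (expand_cpuset and overlap are hypothetical names):

```shell
# Expand a cpuset list like "0-3,8" into one CPU id per line.
expand_cpuset() {
  tr ',' '\n' | while IFS=- read -r lo hi; do
    seq "$lo" "${hi:-$lo}"
  done
}

# Print the CPU ids common to two cpuset lists. Each list has unique
# entries, so duplicated lines in the merged sort are the intersection.
overlap() {
  a="$(echo "$1" | expand_cpuset | sort)"
  b="$(echo "$2" | expand_cpuset | sort)"
  printf '%s\n%s\n' "$a" "$b" | sort | uniq -d
}

# e.g. on a live client:
#   overlap "$(cat /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus)" \
#           "$(cat /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus)"
```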

All Nomad cgroups ``` 2024-10-04 15:23:33.037 {"@level":"error","@message":"postrun failed","@module":"client.alloc_runner","@timestamp":"2024-10-04T15:23:33.037116+02:00","alloc_id":"a7afbf13-6bde-4d59-9dbf-671919ee2b3a","error":"hook \"cpuparts_hook\" failed: write /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus: no space left on device"} 2024-10-04 15:23:33.269 ==> /sys/fs/cgroup/nomad.slice/cgroup.max.descendants <== 2024-10-04 15:23:33.269 max 2024-10-04 15:23:33.269 ==> /sys/fs/cgroup/nomad.slice/cgroup.stat <== 2024-10-04 15:23:33.269 nr_descendants 28 2024-10-04 15:23:33.269 nr_dying_descendants 939 2024-10-04 15:23:33.269 ==> /sys/fs/cgroup/nomad.slice/cpuset.cpus <== 2024-10-04 15:23:33.269 0-31 2024-10-04 15:23:33.269 ==> /sys/fs/cgroup/nomad.slice/cpuset.mems <== 2024-10-04 15:23:33.272 ==> /sys/fs/cgroup/nomad.slice/share.slice/cgroup.max.descendants <== 2024-10-04 15:23:33.272 max 2024-10-04 15:23:33.272 ==> /sys/fs/cgroup/nomad.slice/share.slice/cgroup.stat <== 2024-10-04 15:23:33.272 nr_descendants 25 2024-10-04 15:23:33.272 nr_dying_descendants 928 2024-10-04 15:23:33.272 ==> /sys/fs/cgroup/nomad.slice/share.slice/cpuset.cpus <== 2024-10-04 15:23:33.272 4-31 2024-10-04 15:23:33.272 ==> /sys/fs/cgroup/nomad.slice/share.slice/cpuset.mems <== 2024-10-04 15:23:33.273 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.realtime-jobqueuerunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.273 max 2024-10-04 15:23:33.273 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.realtime-jobqueuerunner.scope/cgroup.stat <== 2024-10-04 15:23:33.273 nr_descendants 0 2024-10-04 15:23:33.273 nr_dying_descendants 0 2024-10-04 15:23:33.273 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.realtime-jobqueuerunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.273 ==> 
/sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.realtime-jobqueuerunner.scope/cpuset.mems <== 2024-10-04 15:23:33.274 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.274 max 2024-10-04 15:23:33.274 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.274 nr_descendants 0 2024-10-04 15:23:33.274 nr_dying_descendants 0 2024-10-04 15:23:33.274 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.274 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.276 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.276 max 2024-10-04 15:23:33.276 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.276 nr_descendants 0 2024-10-04 15:23:33.276 nr_dying_descendants 0 2024-10-04 15:23:33.276 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.276 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.277 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.realtime-taskrunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.277 max 2024-10-04 15:23:33.277 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.realtime-taskrunner.scope/cgroup.stat <== 2024-10-04 15:23:33.277 nr_descendants 0 2024-10-04 15:23:33.277 nr_dying_descendants 0 2024-10-04 15:23:33.277 ==> 
/sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.realtime-taskrunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.277 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.realtime-taskrunner.scope/cpuset.mems <== 2024-10-04 15:23:33.278 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.realtime-jobqueuerunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.278 max 2024-10-04 15:23:33.278 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.realtime-jobqueuerunner.scope/cgroup.stat <== 2024-10-04 15:23:33.278 nr_descendants 0 2024-10-04 15:23:33.278 nr_dying_descendants 0 2024-10-04 15:23:33.278 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.realtime-jobqueuerunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.278 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.realtime-jobqueuerunner.scope/cpuset.mems <== 2024-10-04 15:23:33.280 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.280 max 2024-10-04 15:23:33.280 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.280 nr_descendants 0 2024-10-04 15:23:33.280 nr_dying_descendants 0 2024-10-04 15:23:33.280 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.280 ==> /sys/fs/cgroup/nomad.slice/share.slice/b30be889-ebb4-8524-cc84-bfd79acd6057.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.281 ==> /sys/fs/cgroup/nomad.slice/share.slice/d2a4176c-d6f0-14d6-95ef-6b3b332be2d8.requestmachine.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.281 max 2024-10-04 15:23:33.281 ==> /sys/fs/cgroup/nomad.slice/share.slice/d2a4176c-d6f0-14d6-95ef-6b3b332be2d8.requestmachine.scope/cgroup.stat <== 2024-10-04 
15:23:33.281 nr_descendants 0 2024-10-04 15:23:33.281 nr_dying_descendants 0 2024-10-04 15:23:33.281 ==> /sys/fs/cgroup/nomad.slice/share.slice/d2a4176c-d6f0-14d6-95ef-6b3b332be2d8.requestmachine.scope/cpuset.cpus <== 2024-10-04 15:23:33.281 ==> /sys/fs/cgroup/nomad.slice/share.slice/d2a4176c-d6f0-14d6-95ef-6b3b332be2d8.requestmachine.scope/cpuset.mems <== 2024-10-04 15:23:33.283 ==> /sys/fs/cgroup/nomad.slice/share.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.283 max 2024-10-04 15:23:33.283 ==> /sys/fs/cgroup/nomad.slice/share.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.283 nr_descendants 0 2024-10-04 15:23:33.283 nr_dying_descendants 0 2024-10-04 15:23:33.283 ==> /sys/fs/cgroup/nomad.slice/share.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.283 ==> /sys/fs/cgroup/nomad.slice/share.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.284 ==> /sys/fs/cgroup/nomad.slice/share.slice/ee62acdb-e12b-5f8a-4da1-1f69cf492204.requestmachine.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.284 max 2024-10-04 15:23:33.284 ==> /sys/fs/cgroup/nomad.slice/share.slice/ee62acdb-e12b-5f8a-4da1-1f69cf492204.requestmachine.scope/cgroup.stat <== 2024-10-04 15:23:33.284 nr_descendants 0 2024-10-04 15:23:33.284 nr_dying_descendants 0 2024-10-04 15:23:33.284 ==> /sys/fs/cgroup/nomad.slice/share.slice/ee62acdb-e12b-5f8a-4da1-1f69cf492204.requestmachine.scope/cpuset.cpus <== 2024-10-04 15:23:33.284 ==> /sys/fs/cgroup/nomad.slice/share.slice/ee62acdb-e12b-5f8a-4da1-1f69cf492204.requestmachine.scope/cpuset.mems <== 2024-10-04 15:23:33.285 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.webserver.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.285 max 2024-10-04 15:23:33.285 ==> 
/sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.webserver.scope/cgroup.stat <== 2024-10-04 15:23:33.285 nr_descendants 0 2024-10-04 15:23:33.285 nr_dying_descendants 0 2024-10-04 15:23:33.285 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.webserver.scope/cpuset.cpus <== 2024-10-04 15:23:33.285 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.webserver.scope/cpuset.mems <== 2024-10-04 15:23:33.287 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.realtime-jobqueuerunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.287 max 2024-10-04 15:23:33.287 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.realtime-jobqueuerunner.scope/cgroup.stat <== 2024-10-04 15:23:33.287 nr_descendants 0 2024-10-04 15:23:33.287 nr_dying_descendants 0 2024-10-04 15:23:33.287 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.realtime-jobqueuerunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.287 ==> /sys/fs/cgroup/nomad.slice/share.slice/ed3ccb71-4a92-986d-40fd-d369708df5fd.realtime-jobqueuerunner.scope/cpuset.mems <== 2024-10-04 15:23:33.288 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.realtime-jobqueuerunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.288 max 2024-10-04 15:23:33.288 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.realtime-jobqueuerunner.scope/cgroup.stat <== 2024-10-04 15:23:33.288 nr_descendants 0 2024-10-04 15:23:33.288 nr_dying_descendants 0 2024-10-04 15:23:33.288 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.realtime-jobqueuerunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.288 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.realtime-jobqueuerunner.scope/cpuset.mems <== 2024-10-04 15:23:33.289 ==> 
/sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.289 max 2024-10-04 15:23:33.289 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.289 nr_descendants 0 2024-10-04 15:23:33.289 nr_dying_descendants 0 2024-10-04 15:23:33.289 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.289 ==> /sys/fs/cgroup/nomad.slice/share.slice/dadef66f-4e6f-7462-f16f-901bbb7efb66.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.291 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.291 max 2024-10-04 15:23:33.291 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.291 nr_descendants 0 2024-10-04 15:23:33.291 nr_dying_descendants 0 2024-10-04 15:23:33.291 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.291 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.292 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.webserver.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.292 max 2024-10-04 15:23:33.292 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.webserver.scope/cgroup.stat <== 2024-10-04 15:23:33.292 nr_descendants 0 2024-10-04 15:23:33.292 nr_dying_descendants 0 2024-10-04 15:23:33.292 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.webserver.scope/cpuset.cpus <== 2024-10-04 15:23:33.292 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.webserver.scope/cpuset.mems <== 2024-10-04 
15:23:33.293 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.293 max 2024-10-04 15:23:33.293 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.293 nr_descendants 0 2024-10-04 15:23:33.293 nr_dying_descendants 0 2024-10-04 15:23:33.293 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.293 ==> /sys/fs/cgroup/nomad.slice/share.slice/e1e36d3b-0846-e0c1-730c-f1ae72d6ee0e.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.295 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.promtail.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.295 max 2024-10-04 15:23:33.295 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.promtail.scope/cgroup.stat <== 2024-10-04 15:23:33.295 nr_descendants 0 2024-10-04 15:23:33.295 nr_dying_descendants 0 2024-10-04 15:23:33.295 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.promtail.scope/cpuset.cpus <== 2024-10-04 15:23:33.295 ==> /sys/fs/cgroup/nomad.slice/share.slice/08b62bda-7117-0d53-3d4b-6efaa6aaecd6.promtail.scope/cpuset.mems <== 2024-10-04 15:23:33.296 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.realtime-jobqueuerunner.scope/cgroup.max.descendants <== 2024-10-04 15:23:33.296 max 2024-10-04 15:23:33.296 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.realtime-jobqueuerunner.scope/cgroup.stat <== 2024-10-04 15:23:33.296 nr_descendants 0 2024-10-04 15:23:33.296 nr_dying_descendants 0 2024-10-04 15:23:33.296 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.realtime-jobqueuerunner.scope/cpuset.cpus <== 2024-10-04 15:23:33.296 ==> 
/sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.realtime-jobqueuerunner.scope/cpuset.mems <==
2024-10-04 15:23:33.298 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.realtime-jobqueuerunner.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.298 max
2024-10-04 15:23:33.298 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.realtime-jobqueuerunner.scope/cgroup.stat <==
2024-10-04 15:23:33.298 nr_descendants 0
2024-10-04 15:23:33.298 nr_dying_descendants 0
2024-10-04 15:23:33.298 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.realtime-jobqueuerunner.scope/cpuset.cpus <==
2024-10-04 15:23:33.298 ==> /sys/fs/cgroup/nomad.slice/share.slice/d5099ad0-d1f2-fffa-04cf-153db0063081.realtime-jobqueuerunner.scope/cpuset.mems <==
2024-10-04 15:23:33.299 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.realtime-taskrunner.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.299 max
2024-10-04 15:23:33.299 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.realtime-taskrunner.scope/cgroup.stat <==
2024-10-04 15:23:33.299 nr_descendants 0
2024-10-04 15:23:33.299 nr_dying_descendants 0
2024-10-04 15:23:33.299 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.realtime-taskrunner.scope/cpuset.cpus <==
2024-10-04 15:23:33.299 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.realtime-taskrunner.scope/cpuset.mems <==
2024-10-04 15:23:33.300 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.promtail.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.300 max
2024-10-04 15:23:33.300 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.promtail.scope/cgroup.stat <==
2024-10-04 15:23:33.300 nr_descendants 0
2024-10-04 15:23:33.300 nr_dying_descendants 0
2024-10-04 15:23:33.300 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.promtail.scope/cpuset.cpus <==
2024-10-04 15:23:33.300 ==> /sys/fs/cgroup/nomad.slice/share.slice/d558c62a-7dd4-bd57-576a-41c8372e200d.promtail.scope/cpuset.mems <==
2024-10-04 15:23:33.302 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.promtail.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.302 max
2024-10-04 15:23:33.302 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.promtail.scope/cgroup.stat <==
2024-10-04 15:23:33.302 nr_descendants 0
2024-10-04 15:23:33.302 nr_dying_descendants 0
2024-10-04 15:23:33.302 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.promtail.scope/cpuset.cpus <==
2024-10-04 15:23:33.302 ==> /sys/fs/cgroup/nomad.slice/share.slice/05485654-7197-6dbd-40ea-c799d156d5e9.promtail.scope/cpuset.mems <==
2024-10-04 15:23:33.303 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.promtail.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.303 max
2024-10-04 15:23:33.303 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.promtail.scope/cgroup.stat <==
2024-10-04 15:23:33.303 nr_descendants 0
2024-10-04 15:23:33.303 nr_dying_descendants 0
2024-10-04 15:23:33.303 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.promtail.scope/cpuset.cpus <==
2024-10-04 15:23:33.303 ==> /sys/fs/cgroup/nomad.slice/share.slice/00d83a43-2dcf-3a7b-3784-476b16a8236a.promtail.scope/cpuset.mems <==
2024-10-04 15:23:33.304 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.promtail.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.304 max
2024-10-04 15:23:33.304 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.promtail.scope/cgroup.stat <==
2024-10-04 15:23:33.304 nr_descendants 0
2024-10-04 15:23:33.304 nr_dying_descendants 0
2024-10-04 15:23:33.304 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.promtail.scope/cpuset.cpus <==
2024-10-04 15:23:33.304 ==> /sys/fs/cgroup/nomad.slice/share.slice/cf1b65a1-f0a4-90f4-df88-8d0b1c5d5931.promtail.scope/cpuset.mems <==
2024-10-04 15:23:33.305 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.realtime-taskrunner.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.305 max
2024-10-04 15:23:33.305 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.realtime-taskrunner.scope/cgroup.stat <==
2024-10-04 15:23:33.305 nr_descendants 0
2024-10-04 15:23:33.305 nr_dying_descendants 0
2024-10-04 15:23:33.305 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.realtime-taskrunner.scope/cpuset.cpus <==
2024-10-04 15:23:33.305 ==> /sys/fs/cgroup/nomad.slice/share.slice/f154c016-5663-62ff-b484-631b2d063f30.realtime-taskrunner.scope/cpuset.mems <==
2024-10-04 15:23:33.306 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.max.descendants <==
2024-10-04 15:23:33.306 max
2024-10-04 15:23:33.306 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/cgroup.stat <==
2024-10-04 15:23:33.306 nr_descendants 1
2024-10-04 15:23:33.306 nr_dying_descendants 11
2024-10-04 15:23:33.306 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.cpus <==
2024-10-04 15:23:33.306 0-3
2024-10-04 15:23:33.306 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/cpuset.mems <==
2024-10-04 15:23:33.307 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.realtime-gunicorn.scope/cgroup.max.descendants <==
2024-10-04 15:23:33.307 max
2024-10-04 15:23:33.307 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.realtime-gunicorn.scope/cgroup.stat <==
2024-10-04 15:23:33.307 nr_descendants 0
2024-10-04 15:23:33.307 nr_dying_descendants 0
2024-10-04 15:23:33.307 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.realtime-gunicorn.scope/cpuset.cpus <==
2024-10-04 15:23:33.307 4-7
2024-10-04 15:23:33.307 ==> /sys/fs/cgroup/nomad.slice/reserve.slice/bd759fed-5e8d-90b4-2110-94d5f79737a8.realtime-gunicorn.scope/cpuset.mems <==
```
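The notable signal in the capture above is the same one from the original report: `reserve.slice` holds `0-3` while `share.slice` still spans the full `0-31`, so the two overlap. Here's a small sketch (not Nomad's code; the helper name and the hard-coded values are simply taken from the output above) that parses the cgroup cpuset list format and checks for that overlap:

```python
def parse_cpuset(spec: str) -> set[int]:
    """Parse a cgroup cpuset list like '0-3,8,10-11' into a set of CPU ids."""
    cpus: set[int] = set()
    spec = spec.strip()
    if not spec:
        return cpus
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

# Values observed on the affected node:
reserve = parse_cpuset("0-3")   # nomad.slice/reserve.slice/cpuset.cpus
share = parse_cpuset("0-31")    # nomad.slice/share.slice/cpuset.cpus

overlap = reserve & share
print(sorted(overlap))  # -> [0, 1, 2, 3]
```

On a healthy node the reserve and share sets should partition the cores, so the intersection would be empty.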
shoenig commented 1 week ago

Hi @rodrigol-chan - just to clarify, is this only happening on this one specific node? Are there any tasks still running on this node that were originally created from before the upgrade to Nomad 1.7? Has the node been rebooted since the upgrade to Nomad 1.7?

tgross commented 1 week ago

For some additional context: we've been investigating the circumstances under which the kernel can return this "no space left on device" error in the first place.

That error is ENOSPC, and in the kernel you've got for Ubuntu 22.04 there's only one place it can be returned for cgroups v2: validate_change in cpuset.c#L637-L649. I'm pointing to the mirror of Torvalds' tree here, but I've confirmed this function is the same in Ubuntu's tree for my current 22.04 kernel:

```shell
$ git remote add jammy git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy
$ git fetch jammy # wait a while...
$ git checkout -b jammy-5.15.0-124.134 Ubuntu-5.15.0-124.134
```

Here's the relevant section, with a helpful comment:

```c
/*
 * Cpusets with tasks - existing or newly being attached - can't
 * be changed to have empty cpus_allowed or mems_allowed.
 */
ret = -ENOSPC;
if ((cgroup_is_populated(cur->css.cgroup) || cur->attach_in_progress)) {
    if (!cpumask_empty(cur->cpus_allowed) &&
        cpumask_empty(trial->cpus_allowed))
        goto out;
    if (!nodes_empty(cur->mems_allowed) &&
        nodes_empty(trial->mems_allowed))
        goto out;
}
```

So that suggests that we're somehow ending up in a state where the cpuset is being emptied of cpus or mems allowed while the task is still live. That's the source of @shoenig's follow-up questions above.
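To make that kernel condition concrete, here is a rough Python paraphrase of the validate_change branch quoted above (an illustration only, not the kernel code; the SimpleNamespace objects stand in for the `cur` and `trial` cpusets):

```python
from types import SimpleNamespace

ENOSPC = "no space left on device"

def validate_change(cur, trial):
    """Paraphrase of the kernel's validate_change() ENOSPC branch:
    a cpuset that still has tasks attached (or being attached) cannot
    be changed to an empty cpus_allowed or mems_allowed."""
    if cur.populated or cur.attach_in_progress:
        if cur.cpus_allowed and not trial.cpus_allowed:
            return ENOSPC
        if cur.mems_allowed and not trial.mems_allowed:
            return ENOSPC
    return None  # change is allowed

# A populated cpuset whose proposed change would empty cpus_allowed:
cur = SimpleNamespace(populated=True, attach_in_progress=False,
                      cpus_allowed={0, 1, 2, 3}, mems_allowed={0})
trial = SimpleNamespace(populated=True, attach_in_progress=False,
                        cpus_allowed=set(), mems_allowed={0})
print(validate_change(cur, trial))  # -> no space left on device
```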

rodrigol-chan commented 1 week ago

> is this only happening on this one specific node?

No, it happens on more nodes, though I just noticed it only happens on nodes where we allow periodic jobs to run. The nodes where we do not allow periodic jobs have exactly the same configuration; the only difference is that they are preemptible instances, i.e. Google will arbitrarily power them off.

> Are there any tasks still running on this node that were originally created from before the upgrade to Nomad 1.7? Has the node been rebooted since the upgrade to Nomad 1.7?

The oldest running allocation I see is from 18th October (7 days ago), whose job was submitted on Oct 14th. The oldest current/running job version is from 2024-06-25T14:38:16Z, a few days after the 1.7 upgrade, and the same job also contains the oldest job version that Nomad still remembers, dated 2024-05-02T09:22:13Z. The vast majority of jobs have been submitted this week since we do 20+ releases per day.

All nodes run unattended-upgrades and have rebooted since Oct 15th.


I want to clarify that we run the linux-gcp kernel since we're on Google Cloud, so we're currently on the 6.8 kernel. At the time this started, I believe we were on 6.5.

```shell
$ uname -a
Linux nomad-client-camel 6.8.0-1016-gcp #18~22.04.1-Ubuntu SMP Tue Oct  8 14:58:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```

> that suggests that we're somehow ending up in a state where the cpuset is being emptied of cpus or mems allowed while the task is still live

I'll add some more instrumentation to look at the process tree when the issue happens. Is there any more information I can produce?

tgross commented 1 week ago

> we're actually currently running the 6.8 kernel.

Ok, in the 6.8 kernel there's a second place this error can appear (ref cpuset.c#L3250-L3262): when effective_cpus ends up empty.
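Roughly speaking, under cgroup v2 a cgroup's effective cpuset is its configured cpuset.cpus intersected with its parent's effective set, so a parent shrinking its cpuset can empty a child's effective_cpus even though the child's own cpuset.cpus file still lists cpus. A tiny model of that, with hypothetical cpu numbers:

```python
def effective_cpus(configured: set[int], parent_effective: set[int]) -> set[int]:
    """Rough model of cgroup v2 semantics: a cgroup's effective cpuset is its
    configured cpus intersected with its parent's effective cpus."""
    return configured & parent_effective

# Hypothetical numbers: if the parent's effective set shrinks to 0-3 while a
# child scope is still configured for 4-7, the child's effective set is empty.
child = effective_cpus({4, 5, 6, 7}, {0, 1, 2, 3})
print(child)  # -> set()
```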

> I'll add some more instrumentation to look at the process tree when the issue happens. Is there any more information I can produce?

I suspect we want to look at all the cpuset files in the tree. Something like:

```shell
for f in /sys/fs/cgroup/cpuset.*; do echo -n "$f :"; cat "$f"; done
for f in /sys/fs/cgroup/nomad.slice/cpuset.*; do echo -n "$f :"; cat "$f"; done
for f in /sys/fs/cgroup/nomad.slice/*.slice/cpuset.*; do echo -n "$f :"; cat "$f"; done
for f in /sys/fs/cgroup/nomad.slice/*.slice/*.scope/cpuset.*; do echo -n "$f :"; cat "$f"; done
```
mvegter commented 1 week ago

Perhaps this (partially) relates to #24304 / #24297