brunswyck closed this issue 4 years ago
Whether or not these controller files are available depends on the I/O scheduler in use; the weight-based files require either CFQ or BFQ. Whether enabling them makes sense depends on your hardware. For NVMe it doesn't make much sense to enable a kernel scheduler, as the hardware will just do this itself, so it's often turned off by default. You can see this with:
cat /sys/block/sda/queue/scheduler
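The bracketed entry in that file is the scheduler currently in use. As a quick sketch (device paths depend on your hardware; the parsing step at the end runs on a sample string):

```shell
# List the scheduler file for each block device, if any are present.
for f in /sys/block/*/queue/scheduler; do
    [ -e "$f" ] && echo "$f: $(cat "$f")"
done

# The active scheduler is the bracketed one; extract it from a sample line:
line='mq-deadline kyber [bfq] none'
active=$(echo "$line" | grep -o '\[[^]]*\]' | tr -d '[]')
echo "$active"   # prints: bfq
```

Switching at runtime is typically `echo bfq | sudo tee /sys/block/sda/queue/scheduler`, provided the bfq module is available on your kernel.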
The disks are ordinary SATA-600 disks, so no NVMe in my case. I found this on the matter: https://www.kernel.org/doc/html/latest/block/bfq-iosched.html https://stackoverflow.com/questions/1009577/selecting-a-linux-i-o-scheduler So how should I set the I/O scheduler for my SATA-600 disks?
sudo cat /sys/block/sda/queue/scheduler
[mq-deadline] none
Your system is using SCSI multiqueue which similarly prevents those I/O schedulers from being available. I believe there's a kernel boot option to turn that off which may then get the I/O schedulers that support cgroups to show up again.
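The boot option in question is presumably `scsi_mod.use_blk_mq=0`, which only has an effect on kernels that still ship the legacy (single-queue) block layer, roughly anything before 5.0; on newer kernels the legacy path has been removed and only multiqueue schedulers exist. A sketch, assuming a Debian/Ubuntu-style GRUB setup:

```shell
# /etc/default/grub -- append the option to the default kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash scsi_mod.use_blk_mq=0"

# then regenerate the GRUB config and reboot:
#   sudo update-grub
#   sudo reboot
```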
Closing as LXD appears to be behaving as expected. You asked it to configure a kernel feature which isn't available and it fails with a clear error about what's missing.
Okay. It would improve the documentation if a link with some information regarding I/O scheduler configuration were provided under limits.disk.priority. I hope this post helps some others. Thank you.
My thoughts on this are that this should be a graceful warning rather than preventing the container from launching. There are too many things in lxd like this and it can make things too fragile in production -- too many 'gotchas' if some underlying configuration changes. (Another example of this is when a storage volume is missing, but it is not being used at all -- lxd will simply fail to start, rather than start in degraded mode or start all containers it can.)
@brauner, @stgraber: Is it possible that alternative I/O schedulers like bfq use different controller files instead of /sys/fs/cgroup/blkio/blkio.weight (like /sys/fs/cgroup/blkio/{init.scope,user.slice,system.slice}*/blkio.bfq.weight) that could/should be used instead, or am I still missing the first one, even though cat /sys/block/sda/queue/scheduler tells me that bfq should be active (mq-deadline kyber [bfq] none)? (Switching between the schedulers and remounting /sys/fs/cgroup/blkio does not seem to change anything about that.)
It's certainly possible, though we're very unlikely to be adding support for more files in cgroup1 blkio, as the majority of distros have now moved to cgroup2 or are in the process of moving to it. Any new work on cgroup configuration will be focused on cgroup2 at this point.
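On a cgroup2 host the equivalent knobs live in the unified hierarchy (`io.weight`, or `io.bfq.weight` when bfq is active) rather than `blkio.weight`. A hedged way to check whether the io controller is present (the final snippet parses a sample controller list so it runs anywhere):

```shell
# On a unified-hierarchy host, the root cgroup advertises its controllers here:
cat /sys/fs/cgroup/cgroup.controllers 2>/dev/null

# Self-contained check against a sample controller list:
controllers='cpuset cpu io memory hugetlb pids'
case " $controllers " in
    *" io "*) echo "io controller available" ;;
    *)        echo "io controller missing"  ;;
esac
```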
Required information
Issue description
Cannot apply limits.disk.priority; when I set it, the container won't start.
Steps to reproduce
Information to attach
[ ] Any relevant kernel output (dmesg): none
[ ] Container log (lxc info NAME --show-log):
lxc testcon 20200525054501.630 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1143 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.monitor.testcon"
lxc testcon 20200525054501.645 ERROR cgfsng - cgroups/cgfsng.c:mkdir_eexist_on_last:1143 - File exists - Failed to create directory "/sys/fs/cgroup/cpuset//lxc.payload.testcon"
lxc testcon 20200525054501.654 ERROR utils - utils.c:lxc_can_use_pidfd:1855 - Invalid argument - Kernel does not support waiting on processes through pidfds
lxc testcon 20200525054501.720 WARN cgfsng - cgroups/cgfsng.c:fchowmodat:1455 - No such file or directory - Failed to fchownat(17, memory.oom.group, 1000000000, 0, AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW )
lxc 20200525054502.636 WARN commands - commands.c:lxc_cmd_rsp_recv:122 - Connection reset by peer - Failed to receive response for command "get_cgroup"
lxc 20200525054502.636 WARN commands - commands.c:lxc_cmd_rsp_recv:122 - Connection reset by peer - Failed to receive response for command "get_state"
$ lxc config show testcon
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 18.04 LTS amd64 (release) (20200519.1)
  image.label: release
  image.os: ubuntu
  image.release: bionic
  image.serial: "20200519.1"
  image.type: squashfs
  image.version: "18.04"
  user.network-config: |-
    cloud-config
  volatile.base_image: 70d3dcaabcffb1aa1644d0ce866efcb141742179e94ad72aefb8d3502338a71f
  volatile.eth0.hwaddr: 00:16:3e:59:6b:1e
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.power: STOPPED
devices: {}
ephemeral: false
profiles:
- testcon
stateful: false
description: ""
t=2020-05-25T05:44:56+0000 lvl=info msg="LXD 4.1 is starting in normal mode" path=/var/snap/lxd/common/lxd
t=2020-05-25T05:44:56+0000 lvl=info msg="Kernel uid/gid map:"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - u 0 0 4294967295"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - g 0 0 4294967295"
t=2020-05-25T05:44:56+0000 lvl=info msg="Configured LXD uid/gid map:"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - u 0 1000000 1000000000"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - g 0 1000000 1000000000"
t=2020-05-25T05:44:56+0000 lvl=info msg="Kernel features:"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - netnsid-based network retrieval: yes"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - uevent injection: yes"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - seccomp listener: yes"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - seccomp listener continue syscalls: yes"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - unprivileged file capabilities: yes"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - cgroup layout: hybrid"
t=2020-05-25T05:44:56+0000 lvl=warn msg=" - Couldn't find the CGroup blkio.weight, I/O weight limits will be ignored"
t=2020-05-25T05:44:56+0000 lvl=warn msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
t=2020-05-25T05:44:56+0000 lvl=info msg=" - shiftfs support: disabled"
t=2020-05-25T05:44:56+0000 lvl=info msg="Initializing local database"
t=2020-05-25T05:44:57+0000 lvl=info msg="Starting /dev/lxd handler:"
t=2020-05-25T05:44:57+0000 lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock
t=2020-05-25T05:44:57+0000 lvl=info msg="REST API daemon:"
t=2020-05-25T05:44:57+0000 lvl=info msg=" - binding Unix socket" inherited=true socket=/var/snap/lxd/common/lxd/unix.socket
t=2020-05-25T05:44:57+0000 lvl=info msg="Initializing global database"
t=2020-05-25T05:44:57+0000 lvl=info msg="Firewall loaded driver \"xtables\""
t=2020-05-25T05:44:57+0000 lvl=info msg="Initializing storage pools"
t=2020-05-25T05:44:59+0000 lvl=info msg="Initializing daemon storage mounts"
t=2020-05-25T05:44:59+0000 lvl=info msg="Initializing networks"
t=2020-05-25T05:45:00+0000 lvl=info msg="Pruning leftover image files"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done pruning leftover image files"
t=2020-05-25T05:45:00+0000 lvl=info msg="Loading daemon configuration"
t=2020-05-25T05:45:00+0000 lvl=info msg="Started seccomp handler" path=/var/snap/lxd/common/lxd/seccomp.socket
t=2020-05-25T05:45:00+0000 lvl=info msg="Pruning expired images"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done pruning expired images"
t=2020-05-25T05:45:00+0000 lvl=info msg="Pruning expired instance backups"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done pruning expired instance backups"
t=2020-05-25T05:45:00+0000 lvl=info msg="Updating images"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done updating images"
t=2020-05-25T05:45:00+0000 lvl=info msg="Expiring log files"
t=2020-05-25T05:45:00+0000 lvl=info msg="Updating instance types"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done expiring log files"
t=2020-05-25T05:45:00+0000 lvl=info msg="Done updating instance types"
t=2020-05-25T05:45:00+0000 lvl=info msg="Starting container" action=start created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:31:05+0000
t=2020-05-25T05:45:02+0000 lvl=info msg="Stopping container" action=stop created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:31:05+0000
t=2020-05-25T05:45:03+0000 lvl=info msg="Stopped container" action=stop created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:31:05+0000
t=2020-05-25T05:45:03+0000 lvl=eror msg="Failed to start instance 'testcon': Cannot apply limits.disk.priority as blkio.weight cgroup controller is missing"
t=2020-05-25T06:01:42+0000 lvl=info msg="Starting container" action=start created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:45:00+0000
t=2020-05-25T06:01:43+0000 lvl=info msg="Stopping container" action=stop created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:45:00+0000
t=2020-05-25T06:01:44+0000 lvl=info msg="Stopped container" action=stop created=2020-05-24T19:57:00+0000 ephemeral=false name=testcon project=default stateful=false used=2020-05-25T05:45:00+0000
testcon container log (/var/snap/lxd/common/lxd/logs/testcon.lxc.log):

Error: Cannot apply limits.disk.priority as blkio.weight cgroup controller is missing
Try lxc info --show-log testcon for more info

Output of lxc monitor while reproducing the issue:

location: none
metadata:
  class: task
  created_at: "2020-05-25T06:47:21.598155313Z"
  description: Starting container
  err: Cannot apply limits.disk.priority as blkio.weight cgroup controller is missing
  id: d08cbb6a-d054-47a4-9444-53edd8268b3d
  location: none
  may_cancel: false
  metadata: null
  resources:
    containers:

location: none
metadata:
  context:
    action: stop
    created: 2020-05-24 19:57:00.248990143 +0000 UTC
    ephemeral: "false"
    name: testcon
    project: default
    stateful: "false"
    used: 2020-05-25 06:36:27.679405421 +0000 UTC
  level: info
  message: Stopped container
  timestamp: "2020-05-25T06:47:24.373196634Z"
type: logging