incus ls, does not show Number of Processes

jalbstmeijer commented 4 months ago

Hi,

I'm also opening an issue here, as it looks like this is more related to the docker image than to Incus itself.

I do not get a 'Number of Processes' value, using Incus 6.2.

 incus ls -c n,N
+------------+-----------+
|    NAME    | PROCESSES |
+------------+-----------+
| container1 | -1        |
+------------+-----------+

https://github.com/lxc/incus/issues/970

Kind regards,

J

cmspam commented 4 months ago

Hello, thanks for bringing it to my attention.

I think you're running it with docker, and I've generally done most of my testing with podman. You might notice there are some cgroup-specific settings that only seem to work with podman.

I'm pretty sure this would be resolved by passing through something that's needed but not available to the container, but my knowledge is more podman-centric than docker-centric. So we'll have to figure out what it might be exactly.

That said, could you please let me know the command you used to run the container? Is it the same as in the README? Also, if you check the log, do you see anything unusual?

jalbstmeijer commented 4 months ago

> That said, could you please let me know the command you used to run the container? Is it the same as in the README?

docker run -d --name incus --privileged --env SETIPTABLES=true --env USELXCFS=true --restart always --network host --volume /dev:/dev --volume /var/lib/incus:/var/lib/incus --volume /var/lib/incus-lxcfs:/var/lib/incus-lxcfs --volume /lib/modules:/lib/modules:ro ghcr.io/cmspam/incus-docker:latest

> Also, if you check the log, do you see anything unusual?

iptables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
ip6tables: No chain/target/match by that name.
Starting LXCFS at /opt/incus/bin/lxcfs
Ignoring invalid max threads value 4294967295 > max (100000).
Using runtime path /run
Running lxcfslib_init to reload liblxcfs
mount namespace: 7
hierarchies:
0: fd:   8: cpuset,cpu,io,memory,hugetlb,pids,misc
Kernel supports pidfds
Kernel supports swap accounting
api_extensions:
- cgroups
- sys_cpu_online
- proc_cpuinfo
- proc_diskstats
- proc_loadavg
- proc_meminfo
- proc_stat
- proc_swaps
- proc_uptime
- proc_slabinfo
- shared_pidns
- cpuview_daemon
- loadavg_daemon
- pidfds
Using default interface naming scheme 'v252'.
time="2024-07-03T14:25:48Z" level=warning msg="AppArmor support has been disabled because of lack of kernel support"
time="2024-07-03T14:25:48Z" level=warning msg=" - AppArmor support has been disabled, Disabled because of lack of kernel support"
time="2024-07-03T14:25:48Z" level=warning msg=" - Couldn't find the CGroup memory swap accounting, swap limits will be ignored"
time="2024-07-03T14:25:48Z" level=warning msg="Instance type not operational" driver=qemu err="KVM support is missing (no /dev/kvm)" type=virtual-machine
time="2024-07-03T14:25:49Z" level=error msg="balance: Unable to set cpuset" err="setting cgroup item for the container failed" name=container1 value="0,1"
time="2024-07-03T14:25:49Z" level=error msg="balance: Unable to set cpuset" err="setting cgroup item for the container failed" name=container1 value="0,1"
time="2024-07-03T14:25:49Z" level=warning msg="Failed getting process count" audit_architecture=3221225534 container=container1 err="open /sys/fs/cgroup/lxc.container1/pids.current: no such file or directory" project=default seccomp_notify_fd=37 seccomp_notify_flags=0 seccomp_notify_id=4356026212930604105 seccomp_notify_mem_fd=36 seccomp_notify_pid=78 syscall_number=99
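
The last warning is the relevant one: Incus cannot open pids.current for the container's cgroup. To check for that file by hand from inside the incus-docker container, something like the sketch below may help. The lxc.<name> layout is taken from the path in the warning; the function name and the optional root argument are made up for illustration.

```shell
# Sketch: read an instance's process count straight from its cgroup.
# Assumes the /sys/fs/cgroup/lxc.<name>/pids.current layout seen in the
# warning above. instance_pids and its second argument are hypothetical.
instance_pids() {
  name="$1"
  root="${2:-/sys/fs/cgroup}"
  file="$root/lxc.$name/pids.current"
  if [ -r "$file" ]; then
    # pids.current holds a single integer: the number of processes in the cgroup
    cat "$file"
  else
    echo "cannot read $file" >&2
    return 1
  fi
}

# Usage: instance_pids container1
```

If the function reports that the file cannot be read, the cgroup tree visible inside the container does not include the instance's cgroup, which matches the warning.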

cmspam commented 4 months ago

Please try adding --pid=host when running. This was added to the README, perhaps after you had initially started using the image. It resolves some other issues with cgroups so this may fix it.

jalbstmeijer commented 4 months ago

After adding --pid=host, containers won't start anymore.

lxc container1 20240703143713.309 ERROR    cgfsng - ../src/lxc/cgroups/cgfsng.c:__initialize_cgroups:3898 - Invalid cross-device link - Failed to open 10/../..
lxc container1 20240703143713.309 ERROR    cgfsng - ../src/lxc/cgroups/cgfsng.c:initialize_cgroups:4109 - Invalid cross-device link - Failed to initialize cgroups
lxc container1 20240703143713.309 ERROR    cgroup - ../src/lxc/cgroups/cgroup.c:cgroup_init:34 - Bad file descriptor - Failed to initialize cgroup driver
lxc container1 20240703143713.309 ERROR    start - ../src/lxc/start.c:lxc_init:863 - Failed to initialize cgroup driver
lxc container1 20240703143713.309 ERROR    start - ../src/lxc/start.c:__lxc_start:2034 - Failed to initialize container "container1"
lxc container1 20240703143713.929 ERROR    lxccontainer - ../src/lxc/lxccontainer.c:wait_on_daemonized_start:829 - No such file or directory - Failed to receive the container state

cmspam commented 4 months ago

Thanks for checking.

Could you give me the result of: mount | grep cgroup

I would like to check if it's cgroup or cgroup2
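
If it's easier, the same check can be scripted. This is only a sketch: cgroup_flavor is a made-up helper name, and it assumes the usual "<source> on <dir> type <fstype> (<options>)" layout of mount output.

```shell
# Sketch: classify the cgroup setup from `mount` output read on stdin.
# cgroup_flavor is a hypothetical helper, not part of Incus or this image.
cgroup_flavor() {
  awk '
    $4 == "type" && $5 == "cgroup2" { v2 = 1 }   # unified hierarchy (v2)
    $4 == "type" && $5 == "cgroup"  { v1 = 1 }   # legacy controllers (v1)
    END {
      if (v2 && v1) print "hybrid"
      else if (v2)  print "cgroup2"
      else if (v1)  print "cgroup"
      else          print "none"
    }'
}

# Usage: mount | cgroup_flavor
```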

jalbstmeijer commented 4 months ago

mount | grep cgroup

cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)

Adding this seems to do the trick:

--pid=host --cgroupns=host

I'm not sure if I also need alternatives for these options from the podman example:

--cgroups=no-conmon \
--security-opt unmask=/sys/fs/cgroup \

cmspam commented 4 months ago

To confirm: does adding the options below resolve the issue you opened, or does it just allow the containers to run? --pid=host --cgroupns=host

I believe it should also resolve

time="2024-07-03T14:25:49Z" level=error msg="balance: Unable to set cpuset" err="setting cgroup item for the container failed" name=container1 value="0,1"

from the log file, but please let me know if it is resolved.

As long as those parts are all resolved, the cgroup functionality is working the same as on podman, so the other options are probably not needed.

jalbstmeijer commented 4 months ago

> To confirm, adding the below resolves the issue you have opened, or it just allows the containers to run? --pid=host --cgroupns=host
>
> I believe it should also resolve
> time="2024-07-03T14:25:49Z" level=error msg="balance: Unable to set cpuset" err="setting cgroup item for the container failed" name=container1 value="0,1"
> from the log file, but please let me know if it is resolved.

The container starts after adding both options. Process count and memory usage work, and no cpuset errors are logged anymore.

So this issue is solved.

cmspam commented 4 months ago

Great, thanks. I have updated the README to add --pid=host --cgroupns=host for docker so that people won't run into the same issue in the future.
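
For anyone landing here later, this is roughly what the full docker command from this thread looks like with the two options added. Treat it as a sketch assembled from the comments above rather than a copy of the README, which may differ. The DOCKER variable is a hypothetical override added here so the command can be printed instead of executed.

```shell
# Sketch only: the command posted earlier in this thread plus the two
# options that fixed the issue. DOCKER is a hypothetical override
# (DOCKER=echo prints the arguments instead of starting a container).
start_incus() {
  "${DOCKER:-docker}" run -d \
    --name incus \
    --privileged \
    --pid=host \
    --cgroupns=host \
    --env SETIPTABLES=true \
    --env USELXCFS=true \
    --restart always \
    --network host \
    --volume /dev:/dev \
    --volume /var/lib/incus:/var/lib/incus \
    --volume /var/lib/incus-lxcfs:/var/lib/incus-lxcfs \
    --volume /lib/modules:/lib/modules:ro \
    ghcr.io/cmspam/incus-docker:latest
}

# Usage: start_incus            (starts the container)
#        DOCKER=echo start_incus   (dry run, prints the arguments)
```

As reported above, --pid=host on its own left containers unable to start, so the two options should be added together.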

cmspam / incus-docker

incus ls, does not show Number of Processes #5