lxc / lxcfs

FUSE filesystem for LXC
https://linuxcontainers.org/lxcfs

seems to not work in chroot #502

Open aqueos opened 2 years ago

aqueos commented 2 years ago

hi,

I was trying to use lxcfs inside a chroot, but lxcfs does not seem to trigger the virtualisation. Could you tell me the minimum needed to make it work (capabilities, for example) and how lxcfs detects and triggers the virtualisation? I could not find it in the code, but I am not a dev, so... :)

I tried to find out by reading the source but could not. I can mount the FS and all my processes are in cgroups, but the limits do not show up in /proc files such as meminfo, cpuinfo, etc.

Thanks a lot for your help!

regards, Ghislain.

stgraber commented 2 years ago

LXCFS is a FUSE filesystem, so to run it, you need access to FUSE (/dev/fuse and /sys/fs/fuse/connections). Then when running LXCFS, the target path will be populated with files that account for your cgroup limits. You then need to mount those over the matching original files to replace them with the LXCFS version.

All this is usually done automatically by container managers.
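
When no container manager is involved, the overlay step described above can be sketched by hand. This is a minimal dry-run sketch: it only prints the `mount --bind` commands instead of running them. The function name `lxcfs_bind_cmds` is hypothetical, and the two default paths (`/var/lib/lxcfs` for the LXCFS mount point and `/vservers/testlxcfs` for the chroot) are assumptions taken from later comments in this thread.

```shell
#!/bin/sh
# Print (do not execute) the bind mounts that overlay LXCFS files on a
# chroot's /proc. Function name and default paths are illustrative
# assumptions, not part of lxcfs itself.
lxcfs_bind_cmds() {
    lxcfs_dir=${1:-/var/lib/lxcfs}   # where lxcfs is mounted on the host
    root=${2:-/vservers/testlxcfs}   # root directory of the chroot
    for f in cpuinfo diskstats loadavg meminfo stat swaps uptime; do
        printf 'mount --bind %s/proc/%s %s/proc/%s\n' \
            "$lxcfs_dir" "$f" "$root" "$f"
    done
}

# Show the commands for the layout used in this thread.
lxcfs_bind_cmds /var/lib/lxcfs /vservers/testlxcfs
```

Reviewing the printed commands before piping them to a root shell keeps the step easy to audit. The file list matches the /proc entries mentioned in this issue.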

aqueos commented 2 years ago

hi,

I set it up manually, as this is not lxd/lxc but a custom chroot whose processes I also put in a cgroup.

Inside this 'container' I have:

lxcfs on /proc/cpuinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/diskstats type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/loadavg type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/meminfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/stat type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/swaps type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /proc/uptime type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

in the chroot but mounted with

/var/lib/lxcfs/proc/cpuinfo /proc/cpuinfo none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/diskstats /proc/diskstats none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/loadavg /proc/loadavg none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/meminfo /proc/meminfo none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/stat /proc/stat none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/swaps /proc/swaps none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
/var/lib/lxcfs/proc/uptime /proc/uptime none bind,fuse.rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0

@testlxcfs:[~]: cat /proc/meminfo
MemTotal: 32912260 kB
MemFree: 29939332 kB

[~]: cat /sys/fs/cgroup/testlxcfs/memory.limit_in_bytes
12884901888
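
The two values above can be compared mechanically: a limit of 12884901888 bytes is 12582912 kB, well below the reported MemTotal, so the file is still showing host memory. The following is a minimal sketch of that check. The function name `meminfo_is_limited` is hypothetical, and the sample file simply replays the numbers quoted in this comment.

```shell
#!/bin/sh
# meminfo_is_limited MEMINFO_FILE LIMIT_BYTES
# Succeeds (exit 0) when MemTotal, in kB, does not exceed the cgroup limit.
# The function name is an illustrative assumption.
meminfo_is_limited() {
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$1")
    limit_kb=$(( $2 / 1024 ))        # limit_in_bytes is in bytes, meminfo in kB
    [ "$total_kb" -le "$limit_kb" ]
}

# Replay the values from this comment using a temporary sample file.
sample=$(mktemp)
printf 'MemTotal: 32912260 kB\nMemFree: 29939332 kB\n' > "$sample"
if meminfo_is_limited "$sample" 12884901888; then
    echo "MemTotal reflects the cgroup limit"
else
    echo "MemTotal still shows host memory"
fi
rm -f "$sample"
```

On a live system the same function could be pointed at /proc/meminfo and the contents of memory.limit_in_bytes, assuming a cgroup v1 memory controller as in this setup.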

Is there a way to debug this, to see whether it is looking for the information in the wrong place? I am not in a PID namespace or user namespace, just a cgroup inside the chroot.

regards, Ghislain

PS: lxcfs 4.0.11.

aqueos commented 2 years ago

I must add: the "container" is in a mount namespace, and the system mounts lxcfs inside it after starting the fake init process.

aqueos commented 2 years ago

Does lxcfs work if the processes are not in a PID namespace but only in a cgroup?

brauner commented 2 years ago

> Does lxcfs work if the processes are not in a PID namespace but only in a cgroup?

Yeah, most features should work. We have had people send us patches for that.

aqueos commented 2 years ago

hi,

I tried a testbed with a chroot.

I created a cgroup with limits:

VSHOST:root@195-154-107-122:[~]: cgget testlxcfs|grep limit_in_bytes
memory.limit_in_bytes: 170188800
memory.memsw.limit_in_bytes: 170188800
memory.kmem.tcp.limit_in_bytes: 9223372036854771712
memory.kmem.limit_in_bytes: 170188800
memory.soft_limit_in_bytes: 170188800
hugetlb.1GB.limit_in_bytes: 9223372035781033984
hugetlb.2MB.limit_in_bytes: 9223372036852678656

Then I mounted proc and, on top of it, the lxcfs files:

/dev/md125 on /vservers/testlxcfs type ext4 (rw,relatime,data=ordered)
udev on /vservers/testlxcfs/dev type devtmpfs (rw,nosuid,relatime,size=16411708k,nr_inodes=4102927,mode=755)
devpts on /vservers/testlxcfs/dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
none on /vservers/testlxcfs/dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /vservers/testlxcfs/dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=7363100k)
/proc on /vservers/testlxcfs/proc type proc (rw,relatime)
/sys on /vservers/testlxcfs/sys type sysfs (rw,relatime)
none on /vservers/testlxcfs/dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
none on /vservers/testlxcfs/run type tmpfs (rw,relatime)
none on /vservers/testlxcfs/run/lock type tmpfs (rw,relatime)
lxcfs on /vservers/testlxcfs/proc/cpuinfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/diskstats type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/loadavg type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/meminfo type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/stat type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/swaps type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/proc/uptime type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)
lxcfs on /vservers/testlxcfs/sys/devices/system/cpu/online type fuse.lxcfs (rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other)

Then I enter the chroot inside the cgroup:


VSHOST:root@195-154-107-122:[~]: cgexec -g *:testlxcfs chroot  /vservers/testlxcfs /bin/bash
VSGUEST:root@195-154-107-122:[~]: 
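
Before looking at meminfo, it can help to confirm that the shell really landed in the intended cgroup. The sketch below reads a /proc/self/cgroup-style file and prints the path recorded for one cgroup v1 controller. The function name `cgroup_path_for` is hypothetical; the `hierarchy-ID:controller-list:path` line format is the standard kernel one.

```shell
#!/bin/sh
# cgroup_path_for CONTROLLER [CGROUP_FILE]
# Print the cgroup path recorded for a v1 controller in a
# /proc/<pid>/cgroup-style file (lines look like "4:memory:/testlxcfs").
# cgroup v2 lines ("0::/path") have an empty controller list and are skipped.
# The function name is an illustrative assumption.
cgroup_path_for() {
    awk -F: -v want="$1" '
        {
            n = split($2, ctl, ",")
            for (i = 1; i <= n; i++) {
                if (ctl[i] == want) {
                    sub(/^[^:]*:[^:]*:/, "")   # drop "ID:controllers:"
                    print
                    exit
                }
            }
        }
    ' "${2:-/proc/self/cgroup}"
}

# Show the memory cgroup of the current shell when the file is available.
if [ -r /proc/self/cgroup ]; then
    cgroup_path_for memory /proc/self/cgroup
fi
```

With the setup in this comment, running `cgroup_path_for memory` inside the chroot would be expected to print `/testlxcfs` if `cgexec` placed the shell correctly; empty output would suggest the memory controller is not attached to a v1 hierarchy.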

VSGUEST:root@195-154-107-122:[~]: free
              total        used        free      shared  buff/cache   available
Mem:       32912020     1205420    31601396        1764      105204    31427944
Swap:       3903484           0     3903484
VSGUEST:root@195-154-107-122:[~]: head /proc/meminfo
MemTotal:       32912020 kB
MemFree:        31601264 kB
MemAvailable:   31427860 kB

strace.txt

So in the chroot, meminfo is not "cgroupized".

cpuinfo etc. are not virtualised either.

I attach an strace of running "free" in the chroot. Did I miss something in my setup?

regards, Ghislain.

aqueos commented 2 years ago

OK, for it to work, it seems that mounting cgroup is not enough: you have to mount each cgroup controller at its own mount point, plus cgroup2 as "unified", the way systemd does it, with 20 mounts instead of just one.

So it seems lxcfs really looks for a specific cgroup mount layout and not just the existing cgroup of the process :)
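
Based on the observation above (which the maintainers have not confirmed in this thread), a quick way to see which cgroup mounts are visible is to filter a /proc/mounts-style file. This is a minimal sketch; the function name `cgroup_mounts` is hypothetical.

```shell
#!/bin/sh
# cgroup_mounts [MOUNTS_FILE]
# Print "fstype mountpoint" for every cgroup (v1) and cgroup2 mount listed
# in a /proc/mounts-style file (fields: device mountpoint fstype options).
# The function name is an illustrative assumption.
cgroup_mounts() {
    awk '$3 == "cgroup" || $3 == "cgroup2" { print $3, $2 }' \
        "${1:-/proc/self/mounts}"
}

# List the cgroup mounts visible to the current process, if readable.
if [ -r /proc/self/mounts ]; then
    cgroup_mounts /proc/self/mounts
fi
```

Running this once on the host and once inside the chroot's mount namespace, then comparing the two listings, would show whether the per-controller mounts are missing from the chroot's view.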

mihalicyn commented 7 months ago

I'm not sure whether this is still relevant. Let's keep this issue open for now, although I'm not sure how many people are really interested in running LXCFS in a chroot environment.