lxc / lxcfs

FUSE filesystem for LXC
https://linuxcontainers.org/lxcfs

lxcfs

Introduction

LXCFS is a small FUSE filesystem written with the intention of making Linux containers feel more like a virtual machine. It started as a side project of LXC but is usable by any runtime.

LXCFS makes sure that the information provided by crucial files in procfs and sysfs, such as:

/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu/online

is container-aware, so that the values displayed (e.g. in /proc/uptime) reflect how long the container, rather than the host, has been running.
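To see which of these files are actually served by LXCFS in a given environment, the mount table can be inspected. The following is a minimal sketch, assuming that LXCFS mounts appear with the filesystem type fuse.lxcfs; the function name lxcfs_mounts and its optional argument are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Print the mount points that are backed by LXCFS.
# Assumption: LXCFS mounts are listed with the filesystem type "fuse.lxcfs".
# The optional first argument (a mount table to read) is a hypothetical
# convenience for experimenting; it defaults to /proc/mounts.
lxcfs_mounts() {
    # Mount table fields: device, mount point, fs type, options, dump, pass
    awk '$3 == "fuse.lxcfs" { print $2 }' "${1:-/proc/mounts}"
}

# Usage (inside a container): lxcfs_mounts
```

Inside a container that uses LXCFS, this would be expected to list paths such as /proc/meminfo and /proc/uptime; an empty result suggests that the files still come straight from the host kernel.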

Prior to the implementation of cgroup namespaces by Serge Hallyn, LXCFS also provided a container-aware cgroupfs tree. It made sure that the container only had access to cgroups underneath its own cgroups, and thus provided additional safety. For systems without support for cgroup namespaces, LXCFS still provides this feature, but it is mostly considered deprecated.

Upgrading LXCFS without restart

LXCFS is split into a shared library, liblxcfs (a libtool module, to be precise), and a simple binary, lxcfs. When upgrading to a newer version of LXCFS, the lxcfs binary is not restarted. Instead, it detects that a new version of the shared library is available and reloads it using dlclose(3) and dlopen(3). This design was chosen so that the FUSE main loop that LXCFS uses does not need to be restarted. If it were restarted, all containers using LXCFS would need to be restarted as well, since they would otherwise be left with broken FUSE mounts.

To force a reload of the shared library at the next possible opportunity, send SIGUSR1 to the PID of the running LXCFS process. This can be as simple as:

rm /usr/lib64/lxcfs/liblxcfs.so # the old library file MUST be deleted first
cp liblxcfs.so /usr/lib64/lxcfs/liblxcfs.so # put the new library file in place
kill -s USR1 $(pidof lxcfs) # trigger the reload
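The same three steps can be wrapped in a small function that refuses to proceed when no LXCFS process is found. This is only a sketch: the function name reload_liblxcfs, its arguments, and the DRY_RUN switch are hypothetical, and the default library path is simply the one from the example above and may differ on your distribution.

```shell
#!/bin/sh
# Replace liblxcfs.so and ask the running lxcfs process to reload it.
# Hypothetical helper: name, arguments and DRY_RUN are illustrative only.
#   $1  path to the newly built liblxcfs.so
#   $2  installed library path (default taken from the example above)
#   $3  PID of lxcfs (default: looked up with pidof)
reload_liblxcfs() {
    new_lib=$1
    lib_path=${2:-/usr/lib64/lxcfs/liblxcfs.so}
    pid=${3:-$(pidof lxcfs)}

    if [ -z "$pid" ]; then
        echo "lxcfs does not appear to be running" >&2
        return 1
    fi

    # With DRY_RUN set, the commands are printed instead of executed.
    run=${DRY_RUN:+echo}

    # Remove the old file first, then install the new one as a fresh file.
    $run rm -f "$lib_path" || return 1
    $run cp "$new_lib" "$lib_path" || return 1

    # $pid is left unquoted on purpose: pidof may return several PIDs.
    $run kill -s USR1 $pid
}

# Usage: DRY_RUN=1 reload_liblxcfs ./liblxcfs.so
```

Running it with DRY_RUN=1 first shows the commands that would be executed without touching the installed library.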

musl

To achieve smooth upgrades through shared library reloads, LXCFS also relies on the fact that destructors are run when dlclose(3) drops the last reference to the shared library, and that constructors are run when dlopen(3) is called. While this is true for glibc, it is not true for musl (see the section "Unloading libraries" in the musl documentation). Users of LXCFS on musl are therefore advised to restart LXCFS completely, along with all containers that make use of it.
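When scripting an upgrade, it can help to know which of the two cases applies. The sketch below uses a common heuristic, not anything provided by LXCFS: it assumes that the output of ldd --version mentions "musl" on musl-based systems. The function name libc_flavor is hypothetical.

```shell
#!/bin/sh
# Guess whether the system C library is musl.
# Assumption (heuristic): `ldd --version` prints a string containing "musl"
# on musl-based systems. Anything else is reported as "other".
libc_flavor() {
    case $(ldd --version 2>&1) in
        *musl*) echo musl ;;
        *)      echo other ;;
    esac
}

# Usage:
#   if [ "$(libc_flavor)" = musl ]; then
#       echo "restart lxcfs and its containers instead of sending SIGUSR1"
#   fi
```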

Building

To build LXCFS, install fuse and the fuse development headers for your distribution. LXCFS prefers fuse3 but also works with sufficiently recent fuse2 versions:

git clone https://github.com/lxc/lxcfs
cd lxcfs
meson setup -Dinit-script=systemd --prefix=/usr build/
meson compile -C build/
sudo meson install -C build/
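Before running meson setup, it can be useful to check which fuse development package is visible to the build. A minimal sketch, assuming pkg-config is installed and that the development packages register the usual module names fuse3 and fuse; the function name fuse_flavor is hypothetical.

```shell
#!/bin/sh
# Report which fuse development files pkg-config can find.
# Assumption: fuse3 registers the pkg-config module "fuse3" and fuse2 the
# module "fuse". Prints "none" if neither (or pkg-config itself) is found.
fuse_flavor() {
    if pkg-config --exists fuse3 2>/dev/null; then
        echo fuse3
    elif pkg-config --exists fuse 2>/dev/null; then
        echo fuse2
    else
        echo none
    fi
}

# Usage: fuse_flavor
```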

To build with sanitizers, pass the -Db_sanitize=... option to meson setup. For example, to enable ASAN and UBSAN:

meson setup -Dinit-script=systemd --prefix=/usr build/ -Db_sanitize=address,undefined
meson compile -C build/

Usage

The recommended command to run lxcfs is:

sudo mkdir -p /var/lib/lxcfs
sudo lxcfs /var/lib/lxcfs

A container runtime wishing to use LXCFS should then bind mount the appropriate files into the correct places on container startup.
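Before bind mounting anything, a runtime or an administrator may want to confirm that the expected files exist under the LXCFS mount point. The following sketch checks for the files listed in the introduction; the function name check_lxcfs_tree is hypothetical, and some files (slabinfo, for example) may be missing on older LXCFS versions.

```shell
#!/bin/sh
# Check that the files a runtime would bind mount exist under the LXCFS root.
# Hypothetical helper; $1 is the LXCFS mount point (default: /var/lib/lxcfs).
# Returns 0 if all files are present, 1 otherwise.
check_lxcfs_tree() {
    root=${1:-/var/lib/lxcfs}
    missing=0
    for f in proc/cpuinfo proc/diskstats proc/meminfo proc/stat \
             proc/swaps proc/uptime proc/slabinfo \
             sys/devices/system/cpu/online; do
        if [ ! -e "$root/$f" ]; then
            echo "missing: $root/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_lxcfs_tree /var/lib/lxcfs && echo "LXCFS tree looks complete"
```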

LXC

To use LXCFS with systemd-based containers, either use LXC 1.1, in which case it should work automatically, or copy the lxc.mount.hook and lxc.reboot.hook files (once built) from this tree to /usr/share/lxcfs, make sure they are executable, and add the following lines to your container configuration:

lxc.mount.auto = cgroup:mixed
lxc.autodev = 1
lxc.kmsg = 0
lxc.include = /usr/share/lxc/config/common.conf.d/00-lxcfs.conf

Using with Docker

docker run -it -m 256m --memory-swap 256m \
      -v /var/lib/lxcfs/proc/cpuinfo:/proc/cpuinfo:rw \
      -v /var/lib/lxcfs/proc/diskstats:/proc/diskstats:rw \
      -v /var/lib/lxcfs/proc/meminfo:/proc/meminfo:rw \
      -v /var/lib/lxcfs/proc/stat:/proc/stat:rw \
      -v /var/lib/lxcfs/proc/swaps:/proc/swaps:rw \
      -v /var/lib/lxcfs/proc/uptime:/proc/uptime:rw \
      -v /var/lib/lxcfs/proc/slabinfo:/proc/slabinfo:rw \
      -v /var/lib/lxcfs/sys/devices/system/cpu:/sys/devices/system/cpu:rw \
      ubuntu:18.04 /bin/bash

On a system with swap enabled, the -u option can be used to set all swap-related values in meminfo to 0:

sudo lxcfs -u /var/lib/lxcfs
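The repeated -v options in the docker run command above can also be generated by a small helper, which keeps the file list in one place. This is a sketch only; the function name lxcfs_docker_args is hypothetical, and it assumes that the LXCFS mount point contains no whitespace, since the output is meant to be word-split by the shell.

```shell
#!/bin/sh
# Print the docker "-v" arguments for the LXCFS files, one word per line.
# Hypothetical helper; $1 is the LXCFS mount point (default: /var/lib/lxcfs).
lxcfs_docker_args() {
    root=${1:-/var/lib/lxcfs}
    for f in cpuinfo diskstats meminfo stat swaps uptime slabinfo; do
        printf '%s\n' "-v" "$root/proc/$f:/proc/$f:rw"
    done
    printf '%s\n' "-v" "$root/sys/devices/system/cpu:/sys/devices/system/cpu:rw"
}

# Usage (the command substitution is intentionally unquoted):
#   docker run -it -m 256m --memory-swap 256m $(lxcfs_docker_args) \
#       ubuntu:18.04 /bin/bash
```

Inside the resulting container, a tool such as free should then report the 256 MB limit rather than the memory of the host.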

Swap handling

If you notice that LXCFS does not show any swap in your container even though your system has swap, please read this section carefully and look for instructions on how to enable swap accounting for your distribution.

Swap cgroup handling on Linux is very confusing, and there is no perfect way for LXCFS to handle it.

Terminology used below:

The main issues are:

As a result, LXCFS had to make some compromises, which are as follows:

Issue reporting

Core dump

If LXCFS crashes, a core dump of the LXCFS process memory can be extremely useful to us.

  1. Please check /var/crash and the output of coredumpctl list in case you already have an LXCFS core dump file.
  2. If not, you can collect one from your system as follows:

On the machine where you run LXCFS, execute as root:

# save the current core_pattern value:
cat /proc/sys/kernel/core_pattern > /root/core_pattern.old_value.bak

# set a new one that collects all core dumps (%e is the executable name, %p the PID):
echo '|/bin/sh -c $@ -- eval exec gzip --fast > /var/crash/core-%e.%p.gz' > /proc/sys/kernel/core_pattern

# wait for the next LXCFS crash, then check:
ls -lah /var/crash

# there should be a file with a name like "core-lxcfs.80581.gz". Please upload it somewhere and share it with us.

# restore the old core_pattern value:
cat /root/core_pattern.old_value.bak > /proc/sys/kernel/core_pattern
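To look for dumps that were written with the core_pattern shown above, a small helper can list matching files. This is a sketch: the function name find_lxcfs_cores is hypothetical, and it only matches the core-lxcfs.<pid>.gz naming produced by that pattern, not dumps stored by systemd-coredump (use coredumpctl list for those).

```shell
#!/bin/sh
# List LXCFS core dumps named like "core-lxcfs.<pid>.gz" in a directory.
# Hypothetical helper; $1 is the directory to search (default: /var/crash).
# Assumes a find(1) that supports -maxdepth (GNU, BSD and BusyBox do).
find_lxcfs_cores() {
    dir=${1:-/var/crash}
    find "$dir" -maxdepth 1 -type f -name 'core-lxcfs.*.gz' 2>/dev/null
}

# Usage: find_lxcfs_cores
```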