lxc / lxcfs

FUSE filesystem for LXC
https://linuxcontainers.org/lxcfs

Reading `/proc/cpuinfo` in a 32bit arm container on arm64 host fails for large numbers of CPUs #608

Closed gibmat closed 1 year ago

gibmat commented 1 year ago

Introduction

Earlier this year it was reported that /proc/cpuinfo was empty on certain Debian armel/armhf builders (bug report). This was eventually traced to lxc containers running armel/armhf builds on arm64 hosts (lxc version 5.0.2 and lxcfs version 5.0.3).

Today, I finally had time to sit down, spin up VMs in QEMU and identify the root cause: the fix for #553 contains an edge case bug. When an arm64 host has more than ~13 CPUs, /proc/cpuinfo reads as empty in any 32bit arm container, because the output no longer fits in the cache in proc_cpuinfo_read().

Problem

Code in the arm64 kernel (link) checks the personality of the calling process and, if it is 32bit, adds one more line for each CPU reported in /proc/cpuinfo (for example model name : ARMv8 Processor rev 1 (v8l)). That's about 40 bytes of additional output per CPU that doesn't appear when the file is read by a process with a 64bit personality. (The reported feature flags also change, but at least in my QEMU setup the overall length of that line was about the same for 32bit and 64bit invocations.)

Currently, within proc_cpuinfo_read() the variable cache_size is set from the value of d->buflen, which is initialized in src/proc_fuse.c as info->buflen = get_procfile_size(path) + BUF_RESERVE_SIZE. get_procfile_size() simply gobbles up the specified file and reports its total size, but does so without accounting for any personality differences.

Therefore, when lxcfs sets up the expected size of the buffer for /proc/cpuinfo on an arm64 host, it unconditionally sizes the buffer for the output a 64bit process would see. When an armel/armhf container comes along, the result that is actually read from the host's /proc/cpuinfo is ~40 bytes longer per CPU. That extra output eats into BUF_RESERVE_SIZE (512 bytes), causing proc_cpuinfo_read() to fail if there are more than 512/~40 = ~13 CPUs on the host system.

Reproducing

QEMU is limited to 8 CPUs for an arm VM, so I simply shrank the value of BUF_RESERVE_SIZE from 512 to 128. With that change, I can successfully read /proc/cpuinfo from a 32bit container with up to 3 CPUs on the arm64 host, but bumping up to 4 CPUs breaks in the same way as reported in the original Debian bug.
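
The thresholds for both reserve sizes follow from a simple division. As a rough sketch, assuming the ~40 extra bytes per CPU estimated above and assuming the read fails as soon as the extra output exceeds the reserve (the function names are hypothetical and not part of lxcfs):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not lxcfs code.
 * Largest CPU count whose extra 32bit output still fits in the reserve.
 * extra_per_cpu is the approximate number of additional bytes per CPU. */
size_t max_cpus_within_reserve(size_t reserve, size_t extra_per_cpu)
{
    if (extra_per_cpu == 0)
        return SIZE_MAX; /* no extra output, so this effect imposes no limit */
    return reserve / extra_per_cpu;
}

/* Returns 1 if the extra 32bit output for ncpus CPUs exceeds the reserve,
 * 0 otherwise. Assumes ncpus * extra_per_cpu does not overflow size_t. */
int overflows_reserve(size_t ncpus, size_t reserve, size_t extra_per_cpu)
{
    return ncpus * extra_per_cpu > reserve;
}
```

Under these assumptions, max_cpus_within_reserve(512, 40) gives 12 and max_cpus_within_reserve(128, 40) gives 3. The second value matches the 3-versus-4 CPU behaviour seen in QEMU, and the first is in line with the ~13 estimate; the exact boundary depends on the real length of the extra line.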

Fix

get_procfile_size() needs to be made aware of personalities as well, similar to the logic in proc_read_with_personality().

(edit to fix typo: sysfile -> procfile... whoops!)

mihalicyn commented 1 year ago

Hi Mathias!

Thanks for your report. I'll take a look and fix it, unless you'd like to send a fix yourself.

gibmat commented 1 year ago

I wasn't sure what the preferred way to fix this would be (add a similar get_procfile_size_with_personality() method?), so I don't have any patch to submit.