containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

Unprivileged buildah container fails to build image with fuse: device not found, try 'modprobe fuse' first fuse-overlayfs: cannot mount: No such file or directory #5456

Open SohamChakraborty opened 7 months ago

SohamChakraborty commented 7 months ago

BUG REPORT INFORMATION

Description Thank you for this project. We are able to overcome some long-standing problems with buildah :)

On to the problem:

buildah fails to mount new container with error message:

time="2024-04-03T10:38:07Z" level=error msg="Unmounting /var/lib/containers/storage/overlay/5c0f234c09da0a1c5407bfaeaed8f40cd62b8cc5b1db38dca04681ae25c73505/merged: invalid argument"
Error: mounting new container: mounting build container "91fa490cee3bb03626882ecd44d0911d69e5743dbe4bcc8b1ab37ac8778579b9": creating overlay mount to /var/lib/containers/storage/overlay/5c0f234c09da0a1c5407bfaeaed8f40cd62b8cc5b1db38dca04681ae25c73505/merged, mount_data="lowerdir=/var/lib/containers/storage/overlay/l/3PIDT2HN3ICE5PVGZ4NIZ3FEHH:/var/lib/containers/storage/overlay/l/Q3PE3QRBTNMUNCGBFDAD4P5LIQ:/var/lib/containers/storage/overlay/l/J77A5AKXJWH3BTWH7NDVQ2DNVZ:/var/lib/containers/storage/overlay/l/BO7DDGKEGJLKIKQBXMYPHAEMQI:/var/lib/containers/storage/overlay/l/JBE7KJRAFYL3TLUFIOBM6UIMPG:/var/lib/containers/storage/overlay/l/XMUY4WUEPMCB3JHAPAKQI4HF6V:/var/lib/containers/storage/overlay/l/R3DEIASPCEB7P5OT6VKAPDNHFD:/var/lib/containers/storage/overlay/l/EYDO7PF2FPO5Y5HW64C6TDAPQP:/var/lib/containers/storage/overlay/l/R5AEMKW4TZJPT4QVEFBUVZXGLQ,upperdir=/var/lib/containers/storage/overlay/5c0f234c09da0a1c5407bfaeaed8f40cd62b8cc5b1db38dca04681ae25c73505/diff,workdir=/var/lib/containers/storage/overlay/5c0f234c09da0a1c5407bfaeaed8f40cd62b8cc5b1db38dca04681ae25c73505/work,nodev,fsync=0,volatile": using mount program /usr/bin/fuse-overlayfs: unknown argument ignored: lazytime
fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory

Steps to reproduce the issue: We are evaluating buildah to replace docker in our Jenkins pipeline because of the docker socket problem. We were evaluating kaniko before and it required us to change our Dockerfiles among other problems. So we are evaluating buildah and made very good progress until this final roadblock (we hope).

  1. We are using Jenkins 2.440.2. This is not a Jenkins problem; the version is given only for reference.
  2. The .jenkins/agents.yaml has this spec for the buildah container:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: jenkins
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    container.apparmor.security.beta.kubernetes.io/buildah: unconfined
spec:
  namespace: jenkins
  serviceAccount: jenkins
  runAsUser: 1000
  runAsGroup: 1000
  podRetention: OnFailure
  volumes:
    - name: gradle-cache
      hostPath:
        path: /tmp/jenkins/gradle-cache
    - name: frontend-cache
      hostPath:
        path: /tmp/jenkins/frontend-cache
    - name: varlibcontainers
      emtpyDir: {}
  containers:
    - name: buildah
      image: quay.io/buildah/stable:v1.35.0
      tty: true
      command:
        - cat
      securityContext:
       capabilities:
         add:
           - "SYS_ADMIN"
           - "MKNOD"
           - "SYS_CHROOT"
           - "SETFCAP"        
      volumeMounts:
        - name: varlibcontainers
          mountPath: /var/lib/containers        
      labels:
        agent: buildah
<SNIPPED_OTHER_CONTAINERS>
  3. It is running in a kOps cluster on AWS.
  4. When we use privileged: true everything works normally. No problems at all. So we already have a solution, but we would prefer not to use it.
  5. The initial problem was related to AppArmor profiles, but as you can see we have solved that with the unconfined profile, thanks to @rhatdan's suggestion on other issue reports.
  6. But now it is failing with the fuse-overlayfs mount problem. To circumvent this, while the build was running, I SSH-ed into the node where the Jenkins pod was running and installed fuse-overlayfs. See details:
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  fuse-overlayfs
0 upgraded, 1 newly installed, 0 to remove and 59 not upgraded.
Need to get 38.7 kB of archives.
After this operation, 112 kB of additional disk space will be used.
Get:1 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu focal/universe amd64 fuse-overlayfs amd64 0.7.6-1 [38.7 kB]
Fetched 38.7 kB in 0s (631 kB/s)          
Selecting previously unselected package fuse-overlayfs.
(Reading database ... 90825 files and directories currently installed.)
Preparing to unpack .../fuse-overlayfs_0.7.6-1_amd64.deb ...
Unpacking fuse-overlayfs (0.7.6-1) ...
Setting up fuse-overlayfs (0.7.6-1) ...
Processing triggers for man-db (2.9.1-1) ...
# modprobe fuse
# modinfo fuse
name:           fuse
filename:       (builtin)
alias:          devname:fuse
alias:          char-major-10-229
alias:          fs-fuseblk
alias:          fs-fuse
license:        GPL
file:           fs/fuse/fuse
description:    Filesystem in Userspace
author:         Miklos Szeredi <miklos@szeredi.hu>
alias:          fs-fusectl
parm:           max_user_bgreq:Global limit for the maximum number of backgrounded requests an unprivileged user can set (uint)
parm:           max_user_congthresh:Global limit for the maximum congestion threshold an unprivileged user can set (uint)
# ll /dev/fuse 
crw-rw-rw- 1 root root 10, 229 Mar 29 07:05 /dev/fuse

Describe the results you received: Receiving error:

Copying config sha256:05455a08881ea9cf0e752bc48e61bbd71a34c029bb13df01e40e3e70e0d007bd
Writing manifest to image destination
time="2024-04-03T10:37:48Z" level=error msg="Unmounting /var/lib/containers/storage/overlay/7a5aac811235289d2b4c5a8b5fcdb407e5b7e46319db68b29df8d7f06c9bcb20/merged: invalid argument"
Error: mounting new container: mounting build container "6ec9bc1c936b5afea9344009c8475888cd2b279abdf752fdaf0c87f4958654ef": creating overlay mount to /var/lib/containers/storage/overlay/7a5aac811235289d2b4c5a8b5fcdb407e5b7e46319db68b29df8d7f06c9bcb20/merged, mount_data="lowerdir=/var/lib/containers/storage/overlay/l/XEWHQZTPZWA3JE4DTCLFF53R4W,upperdir=/var/lib/containers/storage/overlay/7a5aac811235289d2b4c5a8b5fcdb407e5b7e46319db68b29df8d7f06c9bcb20/diff,workdir=/var/lib/containers/storage/overlay/7a5aac811235289d2b4c5a8b5fcdb407e5b7e46319db68b29df8d7f06c9bcb20/work,nodev,fsync=0,volatile": using mount program /usr/bin/fuse-overlayfs: unknown argument ignored: lazytime
fuse: device not found, try 'modprobe fuse' first
fuse-overlayfs: cannot mount: No such file or directory
: exit status 1

Describe the results you expected: Builds to go through

Output of rpm -q buildah or apt list buildah:

# rpm -q buildah
buildah-1.35.0-1.fc39.x86_64

Output of buildah version:

# buildah version
Version:         1.35.0
Go Version:      go1.21.7
Image Spec:      1.1.0
Runtime Spec:    1.1.0
CNI Spec:        1.0.0
libcni Version:  
image Version:   5.30.0
Git Commit:      
Built:           Thu Mar  7 13:20:46 2024
OS/Arch:         linux/amd64
BuildPlatform:   linux/amd64

Output of cat /etc/*release:

# cat /etc/os-release 
NAME="Fedora Linux"
VERSION="39 (Container Image)"
ID=fedora
VERSION_ID=39
VERSION_CODENAME=""
PLATFORM_ID="platform:f39"
PRETTY_NAME="Fedora Linux 39 (Container Image)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:39"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f39/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=39
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=39
SUPPORT_END=2024-11-12
VARIANT="Container Image"
VARIANT_ID=container

# cat /etc/redhat-release 
Fedora release 39 (Thirty Nine)

# cat /etc/fedora-release 
Fedora release 39 (Thirty Nine)

Output of uname -a:

# uname -a
Linux org_name-pr-xxxx-84-3sr25-4r5m4-k0ltx 5.15.0-1026-aws #30~20.04.2-Ubuntu SMP Fri Nov 25 14:53:22 UTC 2022 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# cat /etc/containers/storage.conf 
# This file is the configuration file for all tools
# that use the containers/storage library. The storage.conf file
# overrides all other storage.conf files. Container engines using the
# container/storage library do not inherit fields from other storage.conf
# files.
#
#  Note: The storage.conf file overrides other storage.conf files based on this precedence:
#      /usr/containers/storage.conf
#      /etc/containers/storage.conf
#      $HOME/.config/containers/storage.conf
#      $XDG_CONFIG_HOME/containers/storage.conf (If XDG_CONFIG_HOME is set)
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
# When changing the graphroot location on an SELINUX system, you must
# ensure  the labeling matches the default locations labels with the
# following commands:
# semanage fcontext -a -e /var/lib/containers/storage /NEWSTORAGEPATH
# restorecon -R -v /NEWSTORAGEPATH
graphroot = "/var/lib/containers/storage"

# Optional alternate location of image store if a location separate from the
# container store is required. If set, it must be different than graphroot.
# imagestore = ""

# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

# Transient store mode makes all container metadata be saved in temporary storage
# (i.e. runroot above). This is faster, but doesn't persist across reboots.
# Additional garbage collection must also be performed at boot-time, so this
# option should remain disabled in most configurations.
# transient_store = true

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
"/var/lib/shared",
"/usr/lib/containers/storage",
]

# Allows specification of how storage is populated when pulling images. This
# option can speed the pulling process of images compressed with format
# zstd:chunked. Containers/storage looks for files within images that are being
# pulled from a container registry that were previously pulled to the host.  It
# can copy or create a hard link to the existing file when it finds them,
# eliminating the need to pull them from the container registry. These options
# can deduplicate pulling of content, disk storage of content and can allow the
# kernel to use less memory when running containers.

# containers/storage supports three keys
#   * enable_partial_images="true" | "false"
#     Tells containers/storage to look for files previously pulled in storage
#     rather then always pulling them from the container registry.
#   * use_hard_links = "false" | "true"
#     Tells containers/storage to use hard links rather then create new files in
#     the image, if an identical file already existed in storage.
#   * ostree_repos = ""
#     Tells containers/storage where an ostree repository exists that might have
#     previously pulled content which can be used when attempting to avoid
#     pulling content from the container registry
pull_options = {enable_partial_images = "true", use_hard_links = "false", ostree_repos=""}

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = "0:1668442479:65536"
# remap-gids = "0:1668442479:65536"

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps. This setting overrides the
# Remap-UIDs/GIDs setting.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the maximum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,fsync=0"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Set to use composefs to mount data layers with overlay.
# use_composefs = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
#  extended attribute permissions to processes within containers rather than the
#  "force_mask"  permissions.
#
# force_mask = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
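
The mount_program line in the configuration above is what routes every overlay mount through /usr/bin/fuse-overlayfs, which in turn needs /dev/fuse. The following is a minimal sketch, not part of the original report, for confirming which helper is active and for producing a copy of the file with the helper commented out, so that containers/storage would mount the overlay directly as the file's own comment describes. The function names are hypothetical, and whether a direct kernel overlay mount succeeds in this pod depends on the kernel and the granted capabilities, which the sketch does not check. Other settings such as mountopt may also need review.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of buildah.

# Print the value of the first uncommented mount_program line, if any.
# Prints nothing when the option is absent or commented out.
active_mount_program() {
    conf="${1:-/etc/containers/storage.conf}"
    sed -n 's/^[[:space:]]*mount_program[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$conf" | head -n 1
}

# Write to stdout a copy of the file with any active mount_program line
# commented out. The original file is left untouched.
without_mount_program() {
    conf="${1:-/etc/containers/storage.conf}"
    sed 's/^\([[:space:]]*\)\(mount_program[[:space:]]*=\)/\1# \2/' "$conf"
}
```

A possible way to try it, assuming the CONTAINERS_STORAGE_CONF environment variable is honored by the buildah build in use: run `without_mount_program > /tmp/storage.conf` and then invoke buildah with `CONTAINERS_STORAGE_CONF=/tmp/storage.conf` set.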

flouthoc commented 7 months ago

@SohamChakraborty I think you need to load fuse kernel module, see: https://github.com/containers/podman/blob/main/troubleshooting.md#24-podman-container-images-fail-with-fuse-device-not-found-when-run for more details.

SohamChakraborty commented 7 months ago

We did that, @flouthoc, although not before the Jenkins job started, because I didn't know which node the Jenkins pod would be scheduled on. I waited for the job to start, then SSH-ed into the node where it was running and installed it. I'm not sure whether that timing matters, but I can definitely say that by the time buildah tried to build the image (when it actually needs fuse), the kernel module was present.

From the description of the issue:

# modprobe fuse
# modinfo fuse
name:           fuse
filename:       (builtin)
alias:          devname:fuse
alias:          char-major-10-229
alias:          fs-fuseblk
alias:          fs-fuse
license:        GPL
file:           fs/fuse/fuse
description:    Filesystem in Userspace
author:         Miklos Szeredi <miklos@szeredi.hu>
alias:          fs-fusectl
parm:           max_user_bgreq:Global limit for the maximum number of backgrounded requests an unprivileged user can set (uint)
parm:           max_user_congthresh:Global limit for the maximum congestion threshold an unprivileged user can set (uint)
# ll /dev/fuse 
crw-rw-rw- 1 root root 10, 229 Mar 29 07:05 /dev/fuse

rhatdan commented 7 months ago

If this is an SELinux system, it could be SELinux blocking the automatic loading of the kernel module. Can you cause the module to be loaded on boot via /etc/modules-load.d/?
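
For reference, loading a module at boot through /etc/modules-load.d/ comes down to a file that lists the module name on its own line. The sketch below shows one way to write it idempotently. The function name and the fuse.conf file name are illustrative choices, and the directory argument exists only so the function can be tried against a scratch directory. Note that the modinfo output in the report shows filename: (builtin), which suggests fuse is compiled into that kernel, in which case this step would likely change nothing.

```shell
#!/bin/sh
# Sketch only: ensure_fuse_on_boot is a hypothetical helper name.
# On a real node it would be run as root with the default directory.
ensure_fuse_on_boot() {
    modules_dir="${1:-/etc/modules-load.d}"
    conf="$modules_dir/fuse.conf"
    mkdir -p "$modules_dir" || return 1
    # Do nothing if a line consisting exactly of "fuse" is already present.
    if [ -f "$conf" ] && grep -qx 'fuse' "$conf"; then
        return 0
    fi
    printf 'fuse\n' >> "$conf"
}
```

On a systemd-based node, the entry takes effect at the next boot; `modprobe fuse` covers the running system in the meantime.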

SohamChakraborty commented 7 months ago

This is not an SELinux system (as much as I hate to admit it) :)

# getenforce

Command 'getenforce' not found, but can be installed with:

apt install selinux-utils

# sestatus

Command 'sestatus' not found, but can be installed with:

apt install policycoreutils

# apt install selinux-utils
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  selinux-utils
0 upgraded, 1 newly installed, 0 to remove and 177 not upgraded.
Need to get 122 kB of archives.
After this operation, 642 kB of additional disk space will be used.
Get:1 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu focal/universe amd64 selinux-utils amd64 3.0-1build2 [122 kB]
Fetched 122 kB in 0s (2058 kB/s)     
Selecting previously unselected package selinux-utils.
(Reading database ... 62029 files and directories currently installed.)
Preparing to unpack .../selinux-utils_3.0-1build2_amd64.deb ...
Unpacking selinux-utils (3.0-1build2) ...
Setting up selinux-utils (3.0-1build2) ...
Processing triggers for man-db (2.9.1-1) ...
# getenforce 
Disabled
# 

I think what I can try is to:

  1. Install the package during bootstrapping of the server. I have to edit the cluster configuration yaml and do a rolling restart for new nodes to come up.
  2. Configure the module to be loaded during boot time as you suggested.

Anything else? Any other suggestions?

nalind commented 6 months ago

Is the /dev/fuse device being shared with the pod, with both the device node present and the device's major/minor numbers included in the pod's device control group's list of allowed devices, or the equivalent? Alternatively, since the node appears to be running kernel 5.15, is the kernel's overlayfs a viable option?
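
The first question can be checked from a shell inside the buildah container, as opposed to on the node where the earlier `ll /dev/fuse` was run. The sketch below is one way to do that. The function name and the status strings are made up for illustration. The open attempt is meant to surface the case where the node exists but a device cgroup or similar restriction denies access. A result of "missing" would be consistent with the "No such file or directory" in the reported error, though that is an inference, not something the thread confirms.

```shell
#!/bin/sh
# Sketch only: check_fuse_device is a hypothetical helper name.
# Prints one of: missing, not-a-char-device, present-but-cannot-open, ok
check_fuse_device() {
    dev="${1:-/dev/fuse}"
    if [ ! -e "$dev" ]; then
        echo "missing"
    elif [ ! -c "$dev" ]; then
        echo "not-a-char-device"
    # Try to open the device read-write in a subshell, so that a failed
    # redirection cannot terminate the calling shell.
    elif ! ( : <> "$dev" ) 2>/dev/null; then
        echo "present-but-cannot-open"
    else
        echo "ok"
    fi
}
```

Running `check_fuse_device` with no argument inside the container checks /dev/fuse; passing another path makes it easy to compare against a device that is known to work, such as /dev/null.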
