containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

buildah bud needs an --init option to work around the PID 1 issue during Dockerfile builds #1961

Closed mderoy closed 3 years ago

mderoy commented 5 years ago

Description: The PID 1 zombie-reaping issue described at https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem is probably the nastiest Docker quirk I've ever seen. What's worse, this PID 1 reaping issue can affect applications that are run in the Dockerfile during the container build itself. We had a tool that waited for a PID to go away, which it never did, because the process became a zombie and was never reaped (PID 1 was not reaping it). You can work around the issue in your Dockerfile by installing dumb-init or tini and running your command under it. This makes dumb-init/tini PID 1, and zombie processes will be reaped:

```
RUN easy_install pip
RUN pip install dumb-init
RUN ["/usr/bin/dumb-init", "bash", "-c", "/tmp/buildApp.sh"]
```
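For anyone unfamiliar with what dumb-init/tini actually do as PID 1, here is a minimal sketch (not their real implementation, just an illustration) of the non-blocking reap loop an init process must run so exited children don't linger as zombies:

```python
import os
import time

def reap_zombies():
    """Reap all exited children without blocking, as an init process (PID 1) must."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:  # no children left at all
            break
        if pid == 0:               # children exist, but none have exited yet
            break
        reaped.append(pid)
    return reaped

if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(0)        # child exits and lingers as a zombie until reaped
    time.sleep(0.2)        # give the child time to exit
    print(child in reap_zombies())
```

A real init also forwards signals to its child; this sketch only shows the reaping half of the job.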

docker itself has a --init flag which uses tini, but buildah bud does not have such a flag. That flag makes this an easier workaround for those in the Docker world, since with it every command run in the Dockerfile is not run as PID 1.

Steps to reproduce the issue:

  1. Reproduce the PID 1 zombie issue described in the link above while building the container in the Dockerfile itself, and have your script loop waiting for the zombie PID to disappear. This will cause a hang, since PID 1 does not reap the zombie process.

Describe the results you received: the build toolchain hung waiting for a process to go away, but the zombie process never got cleaned up.

Describe the results you expected: no hang
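The hang can be demonstrated outside of a container build with a small sketch (Linux-only, since it reads /proc; `proc_state` is a hypothetical helper, not part of buildah). The parent never calls wait(), so the child stays in state `Z`, and any loop polling for that PID to disappear would spin forever:

```python
import os
import time

def proc_state(pid):
    """Return the single-letter state of a process from /proc/<pid>/stat ('Z' = zombie)."""
    with open(f"/proc/{pid}/stat") as f:
        # the state is the first field after the parenthesised command name
        return f.read().rsplit(")", 1)[1].split()[0]

if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(0)     # child exits immediately
    time.sleep(0.2)
    # parent has not called wait(), so the child remains a zombie;
    # a script waiting for this PID to vanish would hang here
    print(proc_state(pid))
    os.waitpid(pid, 0)  # reap so this demo itself exits cleanly
```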

Output of rpm -q buildah or apt list buildah:

buildah-1.9.0-2.el7.x86_64

Output of buildah version:

buildah version 1.9.0 (image-spec 1.0.0, runtime-spec 1.0.0)

Output of podman version if reporting a podman build issue:

Version:            1.4.4
RemoteAPI Version:  1
Go Version:         go1.10.3
OS/Arch:            linux/amd64

Output of cat /etc/release:

Red Hat Enterprise Linux Server release 7.7 (Maipo)

Output of uname -a:

Linux <myhostname> 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# storage.conf is the configuration file for all tools
# that share the containers/storage libraries
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# mountopt specifies extra mount options used when mounting the thin devices.
# mountopt = ""

# use_deferred_removal Marking device for deferred removal
# use_deferred_removal = "True"

# use_deferred_deletion Marking device for deferred deletion
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

rhatdan commented 4 years ago

I don't see a docker build --init; I do see docker run --init.

podman run --init exists as well.

mderoy commented 4 years ago

I'll provide more details :) (Also greetings from one town over in Littleton :D)

We have a legacy application which we've now shipped a containerized solution for. Our application has an installer, and in production our container build just runs this installer. We're using ubi-init so that our legacy application can work with the systemd services it needs to run.

In development, though, we want to build a "development container" which, rather than going through the installer, pulls our source code and performs a build (so we have a full development environment where we can change source code, build, etc. in our container). Unfortunately, our build hangs forever because somewhere in the build we wait for some build tool (if I remember correctly, fakeroot) to finish by checking that its PID has been cleaned up.

The issue is, there is no option to do buildah bud --init so that we have an init process while we're building our container image layer by layer.

Aside from altering our build process, our options are then limited to

Obviously, altering our build process to work around this PID check would be the quickest way for us to address this issue, but I'd imagine other developers might face this zombie-reaping issue during their container builds when porting legacy applications, so such an argument to buildah bud may be useful.

rhatdan commented 4 years ago

Well my usual response to something like this is: If you want to create a PR, I am sure we would consider it.

rhatdan commented 3 years ago

Since no one from the community has stepped up to work on this, I am going to close this in one month.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.