containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

[FEATURE REQUEST] Skip unneeded stages from multi-stages #2469

Closed: grantral closed this issue 2 years ago

grantral commented 4 years ago

Description

docker/cli#1134

Steps to reproduce the issue:

  1. buildah bud --target backend

Describe the results you received:

STEP 1: FROM alpine AS frontend_builder
STEP 2: RUN touch frontend
STEP 3: FROM alpine AS frontend
STEP 4: COPY --from=frontend_builder /frontend .
STEP 5: FROM alpine AS backend_builder
STEP 6: RUN touch backend
STEP 7: FROM alpine AS backend
STEP 8: COPY --from=backend_builder /backend .

Describe the results you expected: docker/cli#1134 (comment)

STEP 1: FROM alpine AS backend_builder
STEP 2: RUN touch backend
STEP 3: FROM alpine AS backend
STEP 4: COPY --from=backend_builder /backend .

Output of rpm -q buildah or apt list buildah:

buildah-1.15.0-1.fc32.x86_64

Output of buildah version:

Version:         1.15.0
Go Version:      go1.14.3
Image Spec:      1.0.1-dev
Runtime Spec:    1.0.2-dev
CNI Spec:        0.4.0
libcni Version:  
image Version:   5.5.1
Git Commit:      
Built:           Thu Jan  1 06:00:00 1970
OS/Arch:         linux/amd64

Output of podman version if reporting a podman build issue:

Version:      2.0.2
API Version:  1
Go Version:   go1.14.3
Built:        Thu Jan  1 06:00:00 1970
OS/Arch:      linux/amd64

Output of cat /etc/release:

Fedora release 32 (Thirty Two)
NAME=Fedora
VERSION="32 (Workstation Edition)"
ID=fedora
VERSION_ID=32
VERSION_CODENAME=""
PLATFORM_ID="platform:f32"
PRETTY_NAME="Fedora 32 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:32"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f32/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=32
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=32
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
Fedora release 32 (Thirty Two)
Fedora release 32 (Thirty Two)

Output of uname -a:

Linux localhost.localdomain 5.7.8-200.fc32.x86_64 #1 SMP Thu Jul 9 14:34:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the minimum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = false

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"

# Size is used to set a maximum size of the container image.
# size = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

Output of cat Dockerfile:

FROM alpine as frontend_builder

RUN touch frontend

FROM alpine as frontend

COPY --from=frontend_builder /frontend .

FROM alpine as backend_builder

RUN touch backend

FROM alpine as backend

COPY --from=backend_builder /backend .

rhatdan commented 3 years ago

@TomSweeneyRedHat PTAL

TomSweeneyRedHat commented 3 years ago

@grantral please correct me if necessary, but this appears to be part of the Docker buildkit functionality. @rhatdan is this something we should try adding at this point to handle this particular scenario, or should we include it in whatever work would be necessary to provide the buildkit functionality? Note, I'm practically illiterate as far as buildkit goes; I don't know a lot about it.

grantral commented 3 years ago

@TomSweeneyRedHat

but this appears to be part of the Docker buildkit functionality

Yep.

rhatdan commented 3 years ago

Sure, we could grab it if anyone had time to work on it. It would be best if the community could open PRs to add this feature.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

@flouthoc PTAL

flouthoc commented 3 years ago

Thanks I'll take a look.

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 years ago

@flouthoc any progress?

flouthoc commented 2 years ago

Sorry was not able to take a look. I'll take a look in coming days.

flouthoc commented 2 years ago

I think we would need a design change for storing and processing stages. AFAIK we don't have an easy way to identify indirect dependencies between stages in a multi-stage build.

We would need to store and process stages in a DAG (directed acyclic graph) or some sort of dependency tree. We could evaluate each stage in the DAG concurrently and skip the ones that don't lead up to the target, whether the dependency is direct or indirect.

This is just my early proposal, and I think buildkit does the same. I was not able to think of a simpler or more efficient solution than this. https://github.com/moby/moby/issues/32550

@vrothberg @nalind @rhatdan @giuseppe @mtrmac Any suggestions?

github-actions[bot] commented 2 years ago

A friendly reminder that this issue had no activity for 30 days.

joeycumines commented 2 years ago

I'd describe this as a bug.

Parallelization is obviously a nice-to-have feature, but that's (probably) missing the point of this issue. At the very least, this difference in behavior is currently a blocker for us in migrating from docker to buildah.

I imagine it would be complex to add support for parallel stages, but surely it wouldn't be particularly problematic to pre-compute the dependency graph and omit unnecessary stages?

@flouthoc any updates on this? I'm entirely unfamiliar with the codebase, but I might have a crack at implementing a (simple) fix, unless there's something in progress.

flouthoc commented 2 years ago

@joeycumines Sure! Could you please share your approach?

flouthoc commented 2 years ago

@grantral Thanks, this will be out in the next buildah release.

lucacome commented 2 years ago

This is not in https://github.com/containers/buildah/releases/tag/v1.26.2 so I assume it will be in the next minor release? v1.27.0?

flouthoc commented 2 years ago

@lucacome Yes, buildah 1.26.2 does not contain this feature; it should be supported in v1.27.0. I think there's a plan to release it soon, but @rhatdan @TomSweeneyRedHat could confirm this better.

rhatdan commented 2 years ago

Yes, this will be released in the next couple of weeks, by August definitely. Podman rc1 went out this week. We will cut a release of Buildah as soon as we successfully do the vendor dance and merge buildah into Podman.

lucacome commented 2 years ago

I've installed podman 4.2.0, which is supposed to include buildah 1.27.0 with the changes from this issue, but the behavior is still the same. Am I missing something? Should I open a new issue?

flouthoc commented 2 years ago

@lucacome Works fine for me; note that the first stage is skipped entirely in the build below. Please confirm that you are using the right version. Could you share your Containerfile and what you expect to see in the build output?

[root@fedora bin]# cat Dockerfile 
FROM alpine
RUN echo hello

FROM alpine
RUN echo world
[root@fedora bin]# ./podman build --no-cache -t test .
[2/2] STEP 1/2: FROM alpine
[2/2] STEP 2/2: RUN echo world
world
[2/2] COMMIT test
--> 771f01f08fa
Successfully tagged localhost/test:latest
771f01f08fa20cfd1359558121eafe598541f14264c5d5700866c8587e473fc0
[root@fedora bin]# ./podman version
Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.3
Git Commit:   7fe5a419cfd2880df2028ad3d7fd9378a88a04f4
Built:        Fri Aug 12 09:09:37 2022
OS/Arch:      linux/amd64
[root@fedora bin]#

SaurabhAhuja1983 commented 2 years ago

Can we move this behavior behind a flag? We have a use case where we want to build all of the images so that they are available for manifest/deployment. But buildah now skips the unused target images, breaking our builds.

flouthoc commented 2 years ago

@SaurabhAhuja1983 Sure, this was discussed somewhere before as well. Would --skip-unused-stage=false work for you? Could you create a new issue for this?

SaurabhAhuja1983 commented 2 years ago

Created a new issue: https://github.com/containers/buildah/issues/4243. Thank you @flouthoc for quickly looking into it, and I would appreciate it if it could be fixed with priority.