containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

buildah bud doesn't use cached layers when combined with multi stage build and --label somelabel=somevalue #4950

Open Romain-Geissler-1A opened 1 year ago

Romain-Geissler-1A commented 1 year ago

Description

While testing one of our OCI build tools against both docker and podman, I noticed that my tool's tests are an order of magnitude slower on podman than on docker. While looking for a minimal reproducer, I found that this seems related to multi-stage builds combined with labels passed on the command line (if the labels are written directly in the containerfile, it's OK). In the following reproducer, I expect the layer with sleep 3 to be cached after a first run, but it is not, so the image is always rebuilt completely. In my real case, this sleep 3 is actually a dnf install command.

Steps to reproduce the issue:

  1. Start a container with the very latest buildah:

    rgeissler@ncerndobedev6097:~> podman run -t -i --rm --privileged --pull=always quay.io/buildah/upstream

  2. Then create a minimal containerfile:

    [root@eb194bf8dfa3 /]# mkdir /build-context
    [root@eb194bf8dfa3 /]# cat > /build-context/Dockerfile <<END_OF_DOCKERFILE
    FROM fedora AS useless_stage_1
    FROM useless_stage_1 AS stage_2

    RUN sleep 3
    END_OF_DOCKERFILE


3. Build this image once, with layer caching enabled and an explicit label provided on the command line. It builds fine, doing a `sleep 3`:

    [root@eb194bf8dfa3 /]# buildah bud --layers=true --label somelabel=somevalue /build-context/
    [1/2] STEP 1/1: FROM fedora AS useless_stage_1
    Resolved "fedora" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)
    Trying to pull registry.fedoraproject.org/fedora:latest...
    Getting image source signatures
    Copying blob ad5077952f52 done |
    Copying config 919a420d29 done |
    Writing manifest to image destination
    --> 5c4fc42cf62f
    [2/2] STEP 1/3: FROM 5c4fc42cf62ffa7e1649f932ad3b90bb0f39739531291fd204b2021007f3a4ce AS stage_2
    [2/2] STEP 2/3: RUN sleep 3
    --> 999f19301ff6
    [2/2] STEP 3/3: LABEL "somelabel"="somevalue"
    [2/2] COMMIT
    --> f8c8d0a71f64
    f8c8d0a71f64f6c14e46c0de8f61bcabfd4c39a71ea8b99cdd26ccaeda765c56


4. Re-run the exact same command, hoping that this time the cache is used and `sleep 3` is not run. However, it does not use the cached layer for the sleep command:

    [root@eb194bf8dfa3 /]# buildah bud --layers=true --label somelabel=somevalue /build-context/
    [1/2] STEP 1/1: FROM fedora AS useless_stage_1
    --> b075ebbee3a6
    [2/2] STEP 1/3: FROM b075ebbee3a6450ba3eebbc14319cfe187e4c98dc47c4cb4facb0be7ba798795 AS stage_2
    [2/2] STEP 2/3: RUN sleep 3
    --> a7223a7828a9
    [2/2] STEP 3/3: LABEL "somelabel"="somevalue"
    [2/2] COMMIT
    --> 61615826c9b7
    61615826c9b7a5091f111b15e22a9ad3178ae565ea6b3a65f8f96f4516eae490


5. Re-running the same build twice, but without an explicit `--label` flag on the command line, uses the cache during the second invocation:

    [root@eb194bf8dfa3 /]# buildah bud --layers=true /build-context/
    [1/2] STEP 1/1: FROM fedora AS useless_stage_1
    --> 919a420d29c6
    [2/2] STEP 1/2: FROM 919a420d29c6f5ae0bdc8d1872387d3a878d7f69debce9e24f3f2e0506b2ba0d AS stage_2
    [2/2] STEP 2/2: RUN sleep 3
    [2/2] COMMIT
    --> f5684459425d
    f5684459425d3f88134b202c6dece0c0e2a2b91860d3bc29a536ad5b2c138e7a
    [root@eb194bf8dfa3 /]# buildah bud --layers=true /build-context/
    [1/2] STEP 1/1: FROM fedora AS useless_stage_1
    --> 919a420d29c6
    [2/2] STEP 1/2: FROM 919a420d29c6f5ae0bdc8d1872387d3a878d7f69debce9e24f3f2e0506b2ba0d AS stage_2
    [2/2] STEP 2/2: RUN sleep 3
    --> Using cache f5684459425d3f88134b202c6dece0c0e2a2b91860d3bc29a536ad5b2c138e7a
    --> f5684459425d
    f5684459425d3f88134b202c6dece0c0e2a2b91860d3bc29a536ad5b2c138e7a


6. Adding the `LABEL` directly inside the original containerfile results in the build using the cache correctly, right from the first run:

    [root@eb194bf8dfa3 /]# echo 'LABEL "somelabel"="somevalue"' >>/build-context/Dockerfile
    [root@eb194bf8dfa3 /]# buildah bud --layers=true /build-context/
    [1/2] STEP 1/1: FROM fedora AS useless_stage_1
    --> 919a420d29c6
    [2/2] STEP 1/3: FROM 919a420d29c6f5ae0bdc8d1872387d3a878d7f69debce9e24f3f2e0506b2ba0d AS stage_2
    [2/2] STEP 2/3: RUN sleep 3
    --> Using cache f5684459425d3f88134b202c6dece0c0e2a2b91860d3bc29a536ad5b2c138e7a
    --> f5684459425d
    [2/2] STEP 3/3: LABEL "somelabel"="somevalue"
    [2/2] COMMIT
    --> 2d4b8f439951
    2d4b8f43995156779b1d852b08f419347d9607e72a77d8e35d1f2c3d80ea8f49


**Describe the results you received:**

Layer cache is not always used.

**Describe the results you expected:**

In the above described scenarios, I would expect that layer cache is always used.
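To check this from a script rather than by reading the log, one option is to count the `--> Using cache` lines in the build output. The following is only a minimal sketch: the function name `count_cache_hits` is hypothetical, and it assumes the log format shown in the steps above (one line starting with `--> Using cache` per cached step).

```shell
#!/bin/sh
# Hypothetical helper (not part of buildah): read a build log on stdin and
# print how many steps were served from the layer cache.
# Assumption: each cached step prints a line starting with "--> Using cache",
# as in the logs shown in the reproduction steps.
count_cache_hits() {
    # grep -c prints the number of matching lines. It exits non-zero when the
    # count is 0, so "|| true" keeps the function usable under "set -e".
    grep -c -e '^--> Using cache' || true
}
```

It could be used as `buildah bud --layers=true --label somelabel=somevalue /build-context/ 2>&1 | count_cache_hits`. With the behaviour reported here, the second run of that command would print 0, whereas the second run without `--label` would print 1.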

**Output of `rpm -q buildah` or `apt list buildah`:**

buildah-1.31.0-1.20230731174246479315.main.43.g8af2dc4ea.x86_64


**Output of `buildah version`:**

    Version:         1.32.0-dev
    Go Version:      go1.20.6
    Image Spec:      1.1.0-rc.4
    Runtime Spec:    1.1.0
    CNI Spec:        1.0.0
    libcni Version:
    image Version:   5.27.0-dev
    Git Commit:
    Built:           Mon Jul 31 17:47:23 2023
    OS/Arch:         linux/amd64
    BuildPlatform:   linux/amd64


**Output of `cat /etc/*release`:**

    Fedora release 38 (Thirty Eight)
    NAME="Fedora Linux"
    VERSION="38 (Container Image)"
    ID=fedora
    VERSION_ID=38
    VERSION_CODENAME=""
    PLATFORM_ID="platform:f38"
    PRETTY_NAME="Fedora Linux 38 (Container Image)"
    ANSI_COLOR="0;38;2;60;110;180"
    LOGO=fedora-logo-icon
    CPE_NAME="cpe:/o:fedoraproject:fedora:38"
    DEFAULT_HOSTNAME="fedora"
    HOME_URL="https://fedoraproject.org/"
    DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f38/system-administrators-guide/"
    SUPPORT_URL="https://ask.fedoraproject.org/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"
    REDHAT_BUGZILLA_PRODUCT="Fedora"
    REDHAT_BUGZILLA_PRODUCT_VERSION=38
    REDHAT_SUPPORT_PRODUCT="Fedora"
    REDHAT_SUPPORT_PRODUCT_VERSION=38
    SUPPORT_END=2024-05-14
    VARIANT="Container Image"
    VARIANT_ID=container
    Fedora release 38 (Thirty Eight)
    Fedora release 38 (Thirty Eight)


**Output of `uname -a`:**

Linux eb194bf8dfa3 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022 x86_64 GNU/Linux

Romain-Geissler-1A commented 1 year ago

Note: in my case I hit this issue in real life, using podman 4.4 from RHEL 9.

flouthoc commented 1 year ago

Thanks for reporting, I can reproduce this issue.

flouthoc commented 1 year ago

@Romain-Geissler-1A The issue happens because history is being created incorrectly for the first stage. Until I diagnose the root cause and create a patch for this, a workaround is:

    FROM fedora AS useless_stage_1
    RUN echo "dummy stmt"

    FROM useless_stage_1 AS stage_2
    RUN sleep 3

Romain-Geissler-1A commented 1 year ago

Thanks for the hint.

Actually, I had already updated my build tool (it generates Dockerfiles automatically, which is why it sometimes produces degenerate cases like this) so that we no longer pass --label flags on the command line, but instead write LABEL statements directly in the generated Dockerfile.
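As a rough illustration of that approach (this is not the actual tool, only a sketch), a generator could append one `LABEL` instruction per `key=value` pair to the Dockerfile it produces, in the same form as the `echo 'LABEL "somelabel"="somevalue"'` line from the reproduction steps. The function name `append_labels` is hypothetical, and the sketch assumes label values contain no double quotes or backslashes, since it does no escaping.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): append one LABEL instruction per
# key=value argument to a generated Dockerfile, instead of passing --label
# flags to the build command.
# Assumption: keys and values contain no double quotes or backslashes.
append_labels() {
    dockerfile=$1
    shift
    for kv in "$@"; do
        key=${kv%%=*}     # text before the first '='
        value=${kv#*=}    # text after the first '='
        printf 'LABEL "%s"="%s"\n' "$key" "$value" >> "$dockerfile"
    done
}
```

For example, `append_labels /build-context/Dockerfile somelabel=somevalue` followed by `buildah bud --layers=true /build-context/` corresponds to step 6 of the original report.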

github-actions[bot] commented 1 year ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 1 year ago

@flouthoc any update?

github-actions[bot] commented 11 months ago

A friendly reminder that this issue had no activity for 30 days.