containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

sbom scanning fails to remount context #5617

Open keturn opened 4 months ago

keturn commented 4 months ago

Description

buildah build --sbom trivy fails with an "operation not permitted" error.

Steps to reproduce the issue:

Run buildah under gitlab-runner with its executor pointed at rootless podman. (Yeah, I know, that might not be the most convenient reproduction plan.)

[link to failing build]

Describe the results you received:

[5/5] COMMIT registry.gitlab.com/keturn/pytorch-container:main
error running subprocess: remounting "/var/tmp/buildah2005083198/mnt/rootfs/.context0" in mount namespace with flags 0x1 instead of 0x0: operation not permitted
Error: committing container for step {…}: running scanning command [trivy filesystem -q /.rootfs --format cyclonedx --output /.scans/scan0.json]: exit status 1

Describe the results you expected:

build completes without error

Output of rpm -q buildah or apt list buildah:

buildah-1.36.0-1.fc40.x86_64

Output of buildah version:

Version:         1.36.0
Go Version:      go1.22.3
Image Spec:      1.1.0
Runtime Spec:    1.2.0
CNI Spec:        1.0.0
libcni Version:  
image Version:   5.31.0
Git Commit:      
Built:           Mon May 27 13:11:54 2024
OS/Arch:         linux/amd64
BuildPlatform:   linux/amd64

Output of podman version if reporting a podman build issue:

not a podman build issue, but buildah is executing under the podman runtime:

Client:       Podman Engine
Version:      5.2.0-dev
API Version:  5.2.0-dev
Go Version:   go1.22.2
Git Commit:   b8d95a5893572b37c8257407e964ad06ba87ade6
Built:        Tue Jun 18 12:28:29 2024
OS/Arch:      linux/amd64

-dev? oh geez

Output of cat /etc/release:

It's quay.io/buildah/stable:latest.

Fedora 40

```
Fedora release 40 (Forty)
NAME="Fedora Linux"
VERSION="40 (Container Image)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora Linux 40 (Container Image)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f40/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="Container Image"
VARIANT_ID=container
Fedora release 40 (Forty)
Fedora release 40 (Forty)
```

Output of uname -a:

Linux f53aa9e40bf8 5.15.0-113-generic #123-Ubuntu SMP Mon Jun 10 08:16:17 UTC 2024 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

mostly the container's defaults, just graphroot changed.

[storage]
driver = "overlay"
runroot = "/run/containers/storage"
graphroot = "/cache/containers/storage"
[storage.options]
additionalimagestores = [
"/var/lib/shared",
"/usr/lib/containers/storage",
]
pull_options = {enable_partial_images = "true", use_hard_links = "false", ostree_repos=""}
[storage.options.overlay]
mount_program = "/usr/bin/fuse-overlayfs"
mountopt = "nodev,fsync=0"

github-actions[bot] commented 3 months ago

A friendly reminder that this issue had no activity for 30 days.

keturn commented 3 months ago

stalebot, I can still reproduce this failure

github-actions[bot] commented 2 months ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 2 months ago

@nalind PTAL

nalind commented 2 months ago

1.36 predates #5544, and that changed some of the code paths that would be involved here. It corrected an error that cropped up when we tried to make a bind-mount read-only if it had certain other mount flags already set on its source that weren't being re-requested during the remount call.
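
As a rough illustration of the flag arithmetic involved: in an unprivileged user namespace, the kernel can refuse a bind remount that drops flags such as "nodev" that are already set on the mount, so a read-only remount has to re-request them. The Python sketch below only computes a flag value and never calls mount(2). The constants are copied from `<linux/mount.h>`, and `readonly_remount_flags` is a hypothetical helper for illustration, not buildah's actual code.

```python
# Mount flag bits as defined in <linux/mount.h>.
MS_RDONLY = 0x1
MS_NOSUID = 0x2
MS_NODEV = 0x4
MS_NOEXEC = 0x8
MS_REMOUNT = 0x20
MS_BIND = 0x1000


def readonly_remount_flags(existing_flags: int) -> int:
    """Hypothetical helper: flags for remounting a bind mount read-only.

    Flags already present on the mount (nosuid/nodev/noexec) are carried
    over, since leaving them out asks the kernel to clear them.
    """
    keep = existing_flags & (MS_NOSUID | MS_NODEV | MS_NOEXEC)
    return MS_BIND | MS_REMOUNT | MS_RDONLY | keep


if __name__ == "__main__":
    # A mount that already has "nodev" set keeps it after the remount.
    print(hex(readonly_remount_flags(MS_NODEV)))
```

A remount that passed only `MS_BIND | MS_REMOUNT | MS_RDONLY` here would omit the `MS_NODEV` bit, which is the kind of request that can come back as "operation not permitted".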

A quick skim of the "docker" executor code in gitlab-runner doesn't turn up any unusual flags being used when creating containers, so I'm guessing there are volumes or bind mounts involved. Those can produce surprises of the same type that affected secrets. I also wouldn't be surprised if the build context was provided in a volume, since that's what the OpenShift builder does.

If I run an unprivileged buildah container as my unprivileged user on a Fedora 40 system like so:

mkdir /tmp/tests
cat > /tmp/tests/Dockerfile << EOF
FROM registry.fedoraproject.org/fedora:40
RUN dnf -y distro-sync
EOF
podman run --device /dev/fuse --rm -it -v /tmp/tests:/root/buildcontext:Z -v /var/lib/containers quay.io/buildah/stable:v1.36.0 sh -c 'dnf -y install trivy; buildah build --layers --sbom trivy --sbom-output=/tmp/sbom.json /root/buildcontext'

...I get the described error with quay.io/buildah/stable:v1.36.0, but not with quay.io/buildah/stable:v1.37.0.

That fix is a little mistaken in that it's setting some bits in a mount(2) call's mountflags parameter by directly using the f_flags value a statfs(2) call returned, and those values aren't always interchangeable. For flags like "nodev"/"nosuid"/"noexec", the bits to set are the same ones we get back, but for flags like "relatime" they're quite different. That said, I didn't spot this at the time, and I'm leaning toward keeping the current code if it isn't actively breaking for anyone.
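
To make the mismatch concrete, here is a minimal Python sketch of an explicit translation from statfs(2) `f_flags` bits to mount(2) `mountflags` bits. The constants are hard-coded from `<linux/statfs.h>` and `<linux/mount.h>`. The table and `statfs_flags_to_mount_flags` are hypothetical names for illustration, not the code in buildah or in the fix discussed above.

```python
# f_flags bits reported by statfs(2), from <linux/statfs.h>.
ST_RDONLY = 0x0001
ST_NOSUID = 0x0002
ST_NODEV = 0x0004
ST_NOEXEC = 0x0008
ST_NOATIME = 0x0400
ST_NODIRATIME = 0x0800
ST_RELATIME = 0x1000

# mountflags bits accepted by mount(2), from <linux/mount.h>.
MS_RDONLY = 0x1
MS_NOSUID = 0x2
MS_NODEV = 0x4
MS_NOEXEC = 0x8
MS_NOATIME = 0x400
MS_NODIRATIME = 0x800
MS_BIND = 0x1000
MS_RELATIME = 0x200000

# Explicit mapping, so no bit is reused just because it happens to match.
ST_TO_MS = {
    ST_RDONLY: MS_RDONLY,
    ST_NOSUID: MS_NOSUID,
    ST_NODEV: MS_NODEV,
    ST_NOEXEC: MS_NOEXEC,
    ST_NOATIME: MS_NOATIME,
    ST_NODIRATIME: MS_NODIRATIME,
    ST_RELATIME: MS_RELATIME,
}


def statfs_flags_to_mount_flags(f_flags: int) -> int:
    """Hypothetical helper: translate statfs f_flags into mount(2) flags."""
    result = 0
    for st_bit, ms_bit in ST_TO_MS.items():
        if f_flags & st_bit:
            result |= ms_bit
    return result


if __name__ == "__main__":
    # "nodev" maps to the same bit; "relatime" does not.
    print(hex(statfs_flags_to_mount_flags(ST_NODEV | ST_RELATIME)))
```

In these headers `ST_RELATIME` has the same numeric value as `MS_BIND` (0x1000), so passing `f_flags` straight through would have a "relatime" bit read as a bind flag rather than as `MS_RELATIME`.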

So, I guess the important question is: are people still hitting this problem with the 1.37 image?

github-actions[bot] commented 1 month ago

A friendly reminder that this issue had no activity for 30 days.