containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

buildah bud fails on RUN - go build #1592

Closed: vtereso closed this issue 5 years ago

vtereso commented 5 years ago

Description

Using the following Dockerfile, I am unable to successfully run the buildah bud command:

# Run using --privileged
FROM quay.io/openshift-pipeline/buildah
ARG GIT_SOURCE="https://github.com/a-roberts/knative-helloworld"
ENV TLS_VERIFY="true"
ENV CONTEXT_PATH="/workspace/git-source"
ENV DOCKERFILE_PATH="/workspace/git-source/Dockerfile"
ENV TAG="testing"
WORKDIR ${CONTEXT_PATH}
# This is the above $GIT_SOURCE repo on my local
COPY git-source/ .
VOLUME "/var/lib/containers"
ENTRYPOINT ["/bin/sh", "-c", "buildah build-using-dockerfile --tls-verify=${TLS_VERIFY} --layers -f ${DOCKERFILE_PATH} -t ${TAG} -- ${CONTEXT_PATH}"]

The resulting log is:

STEP 1: FROM golang AS builder
Getting image source signatures
Copying blob sha256:c5e155d5a1d130a7f8a3e24cee0d9e1349bff13f90ec6a941478e558fde53c14
Copying blob sha256:221d80d00ae9675aad24913aacbadfac1ce8b7084f9765a6c0813486082c5c69
Copying blob sha256:4250b3117dca5e14edc32ebf1366cd54e4cda91f17610b76c504a86917ff8b95
Copying blob sha256:3b7ca19181b24b87e24423c01b490633bc1e47d2fcdc1987bf2e37949d6789b5
Copying blob sha256:aa24759e848fee3ef333af3dd3ae951eb042e8cd20b5fc0e28a2f3c52cc7e25f
Copying blob sha256:927e9eaeed1922f626e8a34f9a21b6029f36d4112cbb04dbdbd9065e107a59cb
Copying blob sha256:66293f4dacbd8884954f2c9332298ace627830801c3b484ba89ca424c619f374
Copying config sha256:7ced090ee82ee77beabd76ad1ba3b167acd8609b0b10c4ef46cee3ddf6e6fa5f
Writing manifest to image destination
Storing signatures
STEP 2: WORKDIR /go/src/github.com/knative/docs/helloworld
--> 9c3aa441b8638080568faaf59b277d07cb861f48195a3fbeec558dc5b01e2b2b
STEP 3: FROM 9c3aa441b8638080568faaf59b277d07cb861f48195a3fbeec558dc5b01e2b2b AS builder
STEP 4: COPY . .
STEP 5: FROM bf5d138956b0270fb67c8e0d61d134d54f75639beb3acb9e71f384676235c88a AS builder
--> bf5d138956b0270fb67c8e0d61d134d54f75639beb3acb9e71f384676235c88a
STEP 6: RUN CGO_ENABLED=0 GOOS=linux go build -v -o helloworld
build cache is required, but could not be located: GOCACHE is not defined and neither $XDG_CACHE_HOME nor $HOME are defined
subprocess exited with status 1
subprocess exited with status 1
error building at step {Env:[PATH=/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin GOLANG_VERSION=1.12.5 GOPATH=/go] Command:run Args:[CGO_ENABLED=0 GOOS=linux go build -v -o helloworld] Flags:[] Attrs:map[] Message:RUN CGO_ENABLED=0 GOOS=linux go build -v -o helloworld Original:RUN CGO_ENABLED=0 GOOS=linux go build -v -o helloworld}: exit status 1

I have gotten Kaniko to build this repository. I am curious whether this issue is due to some flags I have not set properly on the buildah command. I have tried setting the $BUILDAH_ISOLATION env variable and the corresponding --isolation flag (I saw on another issue that this could do something). I have run this on my 3.11 OKD cluster and locally on macOS (although still through the quay images) and experienced the same error.

Steps to reproduce the issue:

  1. Build the above Dockerfile
  2. Run it using --privileged (see the sketch below)
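
For concreteness, the two steps might look roughly like this (the image tag here is made up for illustration):

docker build -t buildah-bud-repro .
docker run --privileged buildah-bud-repro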

Describe the results you received: An error during go build about $GOCACHE/$HOME not being set (which they are?)

Describe the results you expected: Clean build

Output of rpm -q buildah or apt list buildah:

package buildah is not installed

Output of buildah version:

buildah: command not found...

Output of podman version if reporting a podman build issue:

podman: command not found...

Output of cat /etc/release:

NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="OpenShift Enterprise"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Red Hat Enterprise Linux Server release 7.6 (Maipo)

Output of uname -a:

Linux tereso-okd 3.10.0-957.12.1.el7.x86_64 #1 SMP Wed Mar 20 11:34:37 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# storage.conf is the configuration file for all tools
# that share the containers/storage libraries
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# mountopt specifies extra mount options used when mounting the thin devices.
# mountopt = ""

# use_deferred_removal Marking device for deferred removal
# use_deferred_removal = "True"

# use_deferred_deletion Marking device for deferred deletion
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
vrothberg commented 5 years ago

Hi @vtereso, thanks for taking the time to open the issue. While we are having a look at it, would you mind adding the requested data to the issue template (Output of ... etc.)? It could help understanding and tracking down the issue.

vrothberg commented 5 years ago

My suspicion is that Kaniko sets $HOME.

vtereso commented 5 years ago

Hi @vtereso, thanks for taking the time to open the issue. While we are having a look at it, would you mind adding the requested data to the issue template (Output of ... etc.)? It could help understanding and tracking down the issue.

These outputs are to be filled using the responses from the quay.io/openshift-pipeline/buildah container?

vtereso commented 5 years ago

My suspicion is that Kaniko sets $HOME.

I don't understand which part of the command is missing $HOME, because the alpine, kaniko, buildah, etc. images all have $HOME specified. My assumption was that some of the layers buildah creates don't have these environment variables set, so the build fails.

vrothberg commented 5 years ago

These outputs are to be filled using the responses from the quay.io/openshift-pipeline/buildah container?

Ideally both: the version used to build the image and the version inside the image.

I don't understand which part of the command is missing $HOME, because the alpine, kaniko, buildah, etc. images all have $HOME specified. My assumption was that some of the layers buildah creates don't have these environment variables set, so the build fails.

I concur. Even setting via --env does not change anything.

vrothberg commented 5 years ago

I could trim it down a bit more:

quay.io/openshift-pipeline/buildah and all other images (as previously mentioned) have $HOME set. When running buildah inside the container, $HOME is not set anymore. That can easily be reproduced with a simple Dockerfile such as:

FROM golang
RUN echo "HOME=$HOME"

EDIT: this Dockerfile must be built inside the container.
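
Something along these lines should reproduce it from inside the buildah container (the file name is arbitrary):

# inside a container started from quay.io/openshift-pipeline/buildah with --privileged
cat > Dockerfile.hometest <<'EOF'
FROM golang
RUN echo "HOME=$HOME"
EOF
buildah bud --layers -f Dockerfile.hometest -t home-test .
# STEP 2 prints "HOME=" instead of the expected "HOME=/root"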

@TomSweeneyRedHat @rhatdan @nalind, any suspicion?

vrothberg commented 5 years ago

Odd side note:

# # inside the container
# buildah from alpine
# buildah run alpine-working-container echo $HOME
/root

EDIT: That certainly came from the shell substituting $HOME too early. The working container's $HOME is empty.

TomSweeneyRedHat commented 5 years ago

@vtereso I'm still a little confused. Can you please add the exact Buildah commands that you used to run into this error for your original log output? I don't expect buildah run to work for you, if for no other reason than that the ENTRYPOINT is ignored by the buildah run command. However, podman run should work.

FWIW, after cloning your repo I did:

# mkdir /var/lib/mycontainer
# cd ~/workspace
# buildah bud -t tom -f ~/Dockerfile.badhome .
# podman run -v /var/lib/mycontainer:/var/lib/containers:Z --device /dev/fuse:rw tom

That failed like yours did. As @vrothberg noted, the buildah bud command seems unable to find the HOME environment variable when it is run inside of a container rather than on a host. I'm not sure why that is.

As a workaround, I edited your workspace/git-source/Dockerfile, adding the ENV line below:

FROM golang as builder
ENV HOME=/root

With that in place, things worked.
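
An untested alternative that targets the Go error message directly would be to point the build cache at an explicit, writable location instead of relying on $HOME:

FROM golang as builder
ENV GOCACHE=/tmp/go-build-cache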

vtereso commented 5 years ago

I could trim it down a bit more:

quay.io/openshift-pipeline/buildah and all other images (as previously mentioned) have $HOME set. When running buildah inside the container, $HOME is not set anymore. That can easily be reproduced with a simple Dockerfile such as:

FROM golang
RUN echo "HOME=$HOME"

EDIT: this Dockerfile must be built inside the container.

@TomSweeneyRedHat @rhatdan @nalind, any suspicion?

This is the problem that I am/was running into. I see that adding an ENV within the Dockerfile seems to fix things, and although that isn't optimal, it gets me over this hurdle.

EDIT: It seems to also fail for me:

Sending build context to Docker daemon  180.7kB
Step 1/11 : FROM quay.io/openshift-pipeline/buildah
 ---> 90833879ccc1
Step 2/11 : ARG GIT_SOURCE="https://github.com/a-roberts/knative-helloworld"
 ---> Using cache
 ---> 124d7ac4b4ae
Step 3/11 : ENV HOME="/root"
 ---> Running in d730ef101246
Removing intermediate container d730ef101246
 ---> 77a358eba2e4
Step 4/11 : ENV TLS_VERIFY="true"
 ---> Running in 7fc3f8bec5fc
Removing intermediate container 7fc3f8bec5fc
 ---> 2ce064583923
Step 5/11 : ENV CONTEXT_PATH="/workspace/git-source"
 ---> Running in 8cd155235722
Removing intermediate container 8cd155235722
 ---> 27a742500cf0
Step 6/11 : ENV DOCKERFILE_PATH="/workspace/git-source/Dockerfile"
 ---> Running in 3fba4913445e
Removing intermediate container 3fba4913445e
 ---> 55821edd21af
Step 7/11 : ENV TAG="testing"
 ---> Running in 83394b746b2c
Removing intermediate container 83394b746b2c
 ---> d0cb5aab27c3
Step 8/11 : WORKDIR ${CONTEXT_PATH}
 ---> Running in 1a914cf30f61
Removing intermediate container 1a914cf30f61
 ---> a0ec781e4e68
Step 9/11 : COPY git-source/ .
 ---> fc3497353168
Step 10/11 : VOLUME "/var/lib/containers"
 ---> Running in 8f41e4d91768
Removing intermediate container 8f41e4d91768
 ---> 9980307f34f0
Step 11/11 : ENTRYPOINT ["/bin/sh", "-c", "buildah build-using-dockerfile --tls-verify=${TLS_VERIFY} --layers -f ${DOCKERFILE_PATH} -t ${TAG} -- ${CONTEXT_PATH}"]
 ---> Running in aba26ecb4ccf
Removing intermediate container aba26ecb4ccf
 ---> 57ba001db6f6
Successfully built 57ba001db6f6
Successfully tagged buildah-l:latest
vincents-mbp:kaniko_debug Vincent.DeSousa.Tereso@ibm.com$ docker run --privileged buildah-l
STEP 1: FROM golang AS builder
Getting image source signatures
Copying blob sha256:c5e155d5a1d130a7f8a3e24cee0d9e1349bff13f90ec6a941478e558fde53c14
Copying blob sha256:221d80d00ae9675aad24913aacbadfac1ce8b7084f9765a6c0813486082c5c69
Copying blob sha256:4250b3117dca5e14edc32ebf1366cd54e4cda91f17610b76c504a86917ff8b95
Copying blob sha256:3b7ca19181b24b87e24423c01b490633bc1e47d2fcdc1987bf2e37949d6789b5
Copying blob sha256:aa24759e848fee3ef333af3dd3ae951eb042e8cd20b5fc0e28a2f3c52cc7e25f
Copying blob sha256:927e9eaeed1922f626e8a34f9a21b6029f36d4112cbb04dbdbd9065e107a59cb
Copying blob sha256:66293f4dacbd8884954f2c9332298ace627830801c3b484ba89ca424c619f374
Copying config sha256:7ced090ee82ee77beabd76ad1ba3b167acd8609b0b10c4ef46cee3ddf6e6fa5f
Writing manifest to image destination
Storing signatures
STEP 2: WORKDIR /go/src/github.com/knative/docs/helloworld
STEP 3: FROM 6c430deef52c8cbfefa2f0d866083b4ea9c4e8af3970c5bff498ac7b2f47cf65 AS builder
--> 6c430deef52c8cbfefa2f0d866083b4ea9c4e8af3970c5bff498ac7b2f47cf65
STEP 4: COPY . .
STEP 5: FROM 286dc25c758c6418a7e48b0d0c2dfdf8089aa0d6d49812b7d98930f8108a420d AS builder
--> 286dc25c758c6418a7e48b0d0c2dfdf8089aa0d6d49812b7d98930f8108a420d
STEP 6: RUN CGO_ENABLED=0 GOOS=linux go build -v -o helloworld
build cache is required, but could not be located: GOCACHE is not defined and neither $XDG_CACHE_HOME nor $HOME are defined
vtereso commented 5 years ago

I took down my environment because I was using the same RHEL VM to run Minishift/OKD (it can only handle so much). Give me a bit and I will update the thread to properly reply.

TomSweeneyRedHat commented 5 years ago

Thanks @vtereso .

Please do include any buildah/podman commands that you use along with the output, as you've been doing. It's hard to guess what's what otherwise. For your latest failure, I'm not seeing an ENV HOME in the output; although ideally you shouldn't need to specify that, it looks like it's required at the moment.

No thoughts at the moment, I'm just playing and looking at different scenarios in hopes of narrowing it down. The login process should be setting that env var, so perhaps it's not getting invoked properly? But if so, I'd expect the same issue for the initial container...

nalind commented 5 years ago

runc sets $HOME if the configuration it receives doesn't include a value, but we currently don't when we're using chroot isolation. Fixing this probably involves extending pkg/chrootuser to look up home directory locations, and if the spec doesn't include a value for HOME, having chroot set it to the value it finds, or / if no value is found.
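
To illustrate that approach with a hypothetical sketch (not buildah's actual pkg/chrootuser code), the chroot setup could look the user up in the root filesystem's /etc/passwd and only inject HOME when the spec doesn't already carry one, falling back to /:

// Hypothetical illustration of the fix described above; names and paths are made up.
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// lookupHomeDir scans rootfs/etc/passwd for username and returns the home
// directory field, or "/" when no entry matches or the file is unreadable.
func lookupHomeDir(rootfs, username string) string {
	f, err := os.Open(filepath.Join(rootfs, "etc/passwd"))
	if err != nil {
		return "/"
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Split(scanner.Text(), ":")
		if len(fields) >= 6 && fields[0] == username {
			return fields[5]
		}
	}
	return "/"
}

// ensureHome appends HOME only if the spec's environment did not already
// provide one, mirroring what runc does under OCI isolation.
func ensureHome(env []string, rootfs, username string) []string {
	for _, e := range env {
		if strings.HasPrefix(e, "HOME=") {
			return env
		}
	}
	return append(env, "HOME="+lookupHomeDir(rootfs, username))
}

func main() {
	// Example spec environment without HOME, as seen in the failing build.
	env := []string{"PATH=/usr/bin:/bin", "GOPATH=/go"}
	fmt.Println(ensureHome(env, "/tmp/rootfs", "root"))
}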

TomSweeneyRedHat commented 5 years ago

Per usual, @nalind is spot on. I just tried:

# buildah bud --isolation=oci -t tom -f ~/Dockerfile.badhome .
# podman run -v /var/lib/mycontainer:/var/lib/containers:Z --device /dev/fuse:rw tom

and it seemed to work for me. @vtereso can you try adding --isolation=oci to your build command and see how things go for you?
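
Applied to the ENTRYPOINT from the original Dockerfile, that would look roughly like this (a sketch only, untested):

ENTRYPOINT ["/bin/sh", "-c", "buildah build-using-dockerfile --isolation=oci --tls-verify=${TLS_VERIFY} --layers -f ${DOCKERFILE_PATH} -t ${TAG} -- ${CONTEXT_PATH}"]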

@nalind I'll take a look at changing chroot tomorrow, holler if I shouldn't.

vrothberg commented 5 years ago

Right, @nalind nailed it.

sh-4.4# buildah from golang
golang-working-container-3
sh-4.4# buildah run golang-working-container-3 sh
# echo $GOPATH
/go

It's really just $HOME, all other variables in the spec are properly set.

vbatts commented 5 years ago

The OP is using chroot isolation because buildah is already being run inside a container.

vtereso commented 5 years ago

@TomSweeneyRedHat In my last response I provided the $HOME variable as seen in Step 3, but that did not resolve the issue with the RUN ... go build layer. Setting the isolation flag (--isolation=oci) did fix things 😄. I was almost hoping there was an error, because I fiddled with almost all of the buildah bud flags but seem to have missed that one 😫. I may have tried it, but perhaps not with that setting 🤔 @vbatts Can you explain why isolation defaults to chroot rather than oci? https://github.com/containers/buildah/blob/master/docs/buildah-bud.md#options <- This seems to specify that oci would be the default, which would make sense since it is an image that would run on Kube in most instances?

vbatts commented 5 years ago

The image you're running is running buildah inside a container: https://github.com/containers/buildah/blob/master/buildahimage/stable/Dockerfile#L27
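
That line sets the image's default isolation, roughly:

ENV BUILDAH_ISOLATION=chroot

so builds run inside this image use chroot isolation unless it is overridden, for example with --isolation=oci or by exporting a different BUILDAH_ISOLATION.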

vtereso commented 5 years ago

@vbatts I guess my question is more about the differences between the isolation levels and what they entail, since I am not familiar with them. IIUC, the buildah image is by definition always run as a container (and buildah commands create containers one level further down), so if isolation defaults to chroot, at least for this use case, it will error?

TomSweeneyRedHat commented 5 years ago

I'll let @vbatts or @nalind talk about the differences in isolation levels, as I'm not very well versed. However, I'm working on putting together a fix so that $HOME will be defined when using chroot isolation, and that should hopefully cure the problem.