containers / build

another build tool for container images (archived, see https://github.com/rkt/rkt/issues/4024)
Apache License 2.0

Container target terminated by signal KILL #162

Open blalor opened 8 years ago

blalor commented 8 years ago

I'm running into the above error when running acbuild in my CI system (GoCD), but not when running it by hand. I get no other meaningful information:

11:56:39.540 Adding dependency "aci.example.com/base/centos7,version=765cc162-17"
11:56:39.547 Running: [yum install -y java-1.8.0-openjdk-headless.x86_64]
11:56:40.310 meta tag not found on aci.example.com/base/centos7: expected a 200 OK got 403
11:56:40.713 Downloading aci.example.com/base/centos7:  0 B/122 MB
11:56:40.783 Downloading aci.example.com/base/centos7:  16.4 KB/122 MB
11:56:41.791 Downloading aci.example.com/base/centos7:  9.97 MB/122 MB
11:56:42.791 Downloading aci.example.com/base/centos7:  32.3 MB/122 MB
11:56:43.792 Downloading aci.example.com/base/centos7:  55.3 MB/122 MB
11:56:44.793 Downloading aci.example.com/base/centos7:  77.6 MB/122 MB
11:56:45.794 Downloading aci.example.com/base/centos7:  100 MB/122 MB 
11:56:46.770 Downloading aci.example.com/base/centos7:  122 MB/122 MB 
11:56:46.771 
11:57:10.291 environ: [GO_PIPELINE_COUNTER=19 GO_TRIGGER_USER=anonymous GO_PIPELINE_LABEL=d7b42638-19 SHELL=/bin/bash TERM=unknown GO_PIPELINE_NAME=aci-go-server AGENT_STARTUP_ARGS=-Dcruise.console.publish.interval=10 -Xms128m -Xmx256m    -Djava.security.egd=file:/dev/./urandom USER=root SUDO_USER=go GO_JOB_NAME=build SUDO_UID=994 USERNAME=root PATH=/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin GO_TO_REVISION_GIT=d7b4263823ae075ce39a9720b9bf3f61df231c0b PWD=/var/lib/go-agent/pipelines/aci-go-server LOG_DIR=/var/log/go-agent LANG=en_US.UTF-8 XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt GO_STAGE_COUNTER=2 HOME=/root SHLVL=2 SUDO_COMMAND=/bin/sh -c ./build.sh $( jq -r .name parent_aci/manifest ),version=$( jq -r '.labels[] | select(.name == "version") | .value' parent_aci/manifest ) aci.example.com/base/go-server ${GO_PIPELINE_LABEL} GO_STAGE_NAME=build-aci GO_FROM_REVISION_GIT=d7b4263823ae075ce39a9720b9bf3f61df231c0b GO_SERVER_URL=https://10.112.114.116:8154/go/ LOGNAME=root GO_DEPENDENCY_LABEL_PARENT_ACI=765cc162-17 SUDO_GID=989 GO_DEPENDENCY_LOCATOR_PARENT_ACI=centos-aci-builder/17/upload-acis/1 GO_REVISION_GIT=d7b4263823ae075ce39a9720b9bf3f61df231c0b _=/usr/local/bin/acbuild]
11:57:10.291 running command: [systemd-nspawn -D .acbuild/target --register=no --setenv PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin /usr/bin/yum install -y java-1.8.0-openjdk-headless.x86_64]
11:57:10.294 Container target terminated by signal KILL.
11:57:10.296 
11:57:10.297 run: exit status 1

This is with acbuild from the current master, and with a couple of debug statements added to output the systemd-nspawn command (I removed --quiet, too) and the environment.

The host OS is CentOS 7.2 with systemd-219-19.el7.x86_64. The base/centos7 dependency is the same.

systemd-nspawn is such a pain in the ass. Any ideas how I can troubleshoot this? Could it have something to do with not having a controlling tty?
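
One way to test the controlling-tty theory is to print the session details from inside the build script and compare a manual run with a CI run. The sketch below is only a diagnostic aid; the function names `has_controlling_tty` and `describe_session` are made up for illustration, and nothing in this thread confirms that a missing tty is the cause.

```shell
#!/usr/bin/env bash
# Diagnostic sketch (hypothetical helper names, not part of acbuild).

# Succeeds only if this process has a controlling terminal:
# opening /dev/tty fails when there is none.
has_controlling_tty() {
    ( : < /dev/tty ) 2>/dev/null
}

# Print a few facts that commonly differ between an interactive shell
# and a CI agent, so the two runs can be compared side by side.
describe_session() {
    if has_controlling_tty; then
        echo "controlling tty: yes"
    else
        echo "controlling tty: no"
    fi
    if [ -t 0 ]; then
        echo "stdin is a tty: yes"
    else
        echo "stdin is a tty: no"
    fi
    echo "euid: $(id -u)"
}
```

Calling `describe_session` at the top of the build script, once by hand and once under the CI agent, shows whether the two environments differ in this respect. Running the build by hand under `setsid` with stdin redirected from /dev/null should approximate the agent's tty-less session.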

cgonyeo commented 8 years ago

I've seen this issue before with yum (and, I think, apt-get), and I have no clue what causes it. I saw it while running acbuild in a tty, so I doubt that a missing controlling tty is the cause. I'll dig into this next time I can spare some cycles.

If I can't figure out the cause of this, https://github.com/appc/acbuild/issues/13 (if it ever gets done) might also end up shining some light on this.

cgonyeo commented 8 years ago

Your logs imply you're getting as far as exec'ing systemd-nspawn in lib/run.go, so this is unlikely to fix it, but would you mind checking whether https://github.com/appc/acbuild/pull/171 helps?

cgonyeo commented 8 years ago

I'm having good luck reproducing acbuild bugs by replicating people's setup. What distro is this on, and any chance you'd be willing to share your acbuild script?

blalor commented 8 years ago

#171 didn't seem to make a difference.

I'm building on stock CentOS 7.2 with updates; currently kernel 3.10.0-327.10.1.el7.x86_64 and systemd-219-19.el7_2.4.x86_64.

I'm avoiding systemd-nspawn entirely by using a shell script replacement.

This is one of my build scripts, but without my systemd-nspawn replacement, it just dies at the first acbuild run:

#!/usr/bin/env bash

## must be run as root, or "acbuild run" will fail

set -e -u -o pipefail

## sudo fucks with our PATH, via secure_path
export PATH=/usr/local/bin:$PATH

acbuild="acbuild --debug"

if [ $# -ne 3 ]; then
    echo "usage: $0 <parent aci spec> <aci name> <version>"
    exit 1
fi

parent_aci_spec="${1}"
aci_name="${2}"
aci_version="${3}"

cur_dir=${PWD}
dest_dir="${cur_dir}/work"
mkdir -p "${dest_dir}"

function cleanup() {
    EXIT=$?

    $acbuild end

    exit ${EXIT}
}

trap cleanup EXIT

$acbuild begin

rootfs=".acbuild/currentaci/rootfs"

$acbuild annotation add created "$( date --rfc-3339=ns | tr ' ' 'T' )"
$acbuild set-name "${aci_name}"

$acbuild label add version "${aci_version}"
$acbuild label add os linux
$acbuild label add arch amd64

$acbuild dependency add "${parent_aci_spec}"

cp -r src ${rootfs}/src/
$acbuild run -- /src/setup.sh

$acbuild set-exec /launch

# $acbuild environment add COLLECTD_HOST
# $acbuild environment add COLLECTD_PORT
# $acbuild environment add COLLECTD_USERNAME
# $acbuild environment add COLLECTD_PASSWORD
# $acbuild environment add LOG_COURIER_DEST
# $acbuild environment add LOG_COURIER_TAGS

rm -rfv \
    ${rootfs}/etc/localtime \
    ${rootfs}/etc/resolv.conf \
    ${rootfs}/usr/share/zoneinfo/

mkdir -p "$( dirname "${dest_dir}/${aci_name}" )"
$acbuild write "${dest_dir}/${aci_name}"

## generate XML (aieeee!) describing what we just built
## this is mainly for GoCD, so we can set properties based on this artifact
{
    echo '<?xml version="1.0" encoding="utf-8"?>'
    echo '<aci>'
    echo "    <name>${aci_name}</name>"
    echo "    <version>${aci_version}</version>"
    echo '</aci>'
} > "${dest_dir}/aci.xml"

## also, since I don't yet have a handle on how these dependencies are going to
## be chained, capture the manifest.
$acbuild cat-manifest > "${dest_dir}/manifest"

## we're probably root, anyway, but check just in case
if [ "${EUID}" -eq 0 ]; then
    ## ensure the calling user can remove the generated files
    chown -R "$( stat --format='%U:%G' . )" "${dest_dir}"
fi

cgonyeo commented 8 years ago

I installed CentOS 7.2, and am on the same versions of the kernel and systemd as you are. I made a src/setup.sh that just runs yum install -y java-1.8.0-openjdk-headless.x86_64, and ran your script with the current master of acbuild.

Still can't reproduce this, and I have no clue why systemd-nspawn isn't working out for you. Maybe this just means it's time to get around to https://github.com/appc/acbuild/issues/13 finally, so it'll be easy for you to swap out systemd-nspawn for something else.
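
For readers wondering what a shell replacement for systemd-nspawn might look like: the sketch below is NOT the script used in this thread (that script was never posted). It is a hypothetical chroot-based shim that understands only the options visible in the logged command line (`-D`, `--register=`, `--setenv`, plus `--quiet`). It assumes it runs as root and that the target rootfs contains /usr/bin/env. Unlike systemd-nspawn, chroot gives no namespace isolation and mounts no /proc or /dev.

```shell
#!/usr/bin/env bash
# Hypothetical systemd-nspawn stand-in built on chroot.
# The function names are illustrative, not part of acbuild.

# Translate nspawn-style arguments into a chroot command line,
# stored in the global array CHROOT_CMD.
build_chroot_cmd() {
    local root="" envs=()
    while [ $# -gt 0 ]; do
        case "$1" in
            -D|--setenv)
                [ $# -ge 2 ] || { echo "option $1 needs a value" >&2; return 2; }
                if [ "$1" = -D ]; then root="$2"; else envs+=("$2"); fi
                shift 2 ;;
            --directory=*) root="${1#--directory=}"; shift ;;
            --setenv=*)    envs+=("${1#--setenv=}"); shift ;;
            --register=*|--quiet) shift ;;  # no chroot equivalent; ignored
            --) shift; break ;;
            -*) echo "unsupported option: $1" >&2; return 2 ;;
            *)  break ;;                    # first non-option starts the command
        esac
    done
    [ -n "$root" ] || { echo "missing -D <dir>" >&2; return 2; }
    [ $# -gt 0 ]   || { echo "missing command" >&2; return 2; }
    # env -i starts from an empty environment, then applies the --setenv pairs.
    CHROOT_CMD=(chroot "$root" /usr/bin/env -i ${envs[@]+"${envs[@]}"} "$@")
}

# Build the command and run it; returns the command's exit status.
nspawn_shim() {
    build_chroot_cmd "$@" || return
    "${CHROOT_CMD[@]}"
}
```

Saved as an executable named systemd-nspawn earlier in PATH, with a final line `nspawn_shim "$@"`, a script like this would intercept the call acbuild makes. Whether that sidesteps the KILL depends on the still-unknown root cause.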

bcg62 commented 8 years ago

I'm now having this same issue when using acbuild v0.4.0; it was not present in v0.3.1.

CentOS Linux release 7.2.1511 (Core)
3.10.0-327.el7.x86_64
systemd-219-19.el7.x86_64

Running: [/bin/sh -c (yum clean all)]
Warning: "/bin/sh" is a symlink, which systemd-nspawn version 219 might error on
Container rootfs terminated by signal KILL.
run: non-zero exit code: 1
Ending the build