gentoo / gentoo-docker-images

[MIRROR] Common effort to get an official and automated gentoo base docker container
https://gitweb.gentoo.org/proj/docker-images.git
GNU General Public License v2.0

Recent systemd images broken? #106

Closed pgrandin closed 3 years ago

pgrandin commented 3 years ago

The following Dockerfile used to work with older stage3 images:

FROM gentoo/stage3:systemd-20210702

ARG BUILD_DATE

RUN emerge --sync
COPY ebuild-ipc /usr/lib/portage/python3.9/ebuild-ipc
RUN FEATURES="-pid-sandbox" emerge -q dev-vcs/git eix app-misc/jq

But with recent stage3 images, I am getting this error:

Step 5/8 : RUN emerge -q dev-vcs/git eix app-misc/jq
 ---> Running in fe58bf483575
>>> Verifying ebuild manifests

 * IMPORTANT: 6 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

>>> Emerging (1 of 20) dev-libs/oniguruma-6.9.7.1::gentoo
>>> Failed to emerge dev-libs/oniguruma-6.9.7.1, Log file:
>>>  '/var/tmp/portage/dev-libs/oniguruma-6.9.7.1/temp/build.log'
 * Package:    dev-libs/oniguruma-6.9.7.1
 * Repository: gentoo
 * Maintainer: arfrever.fta@gmail.com cjk@gentoo.org
 * USE:        abi_x86_64 amd64 elibc_glibc kernel_linux userland_GNU
 * FEATURES:   network-sandbox preserve-libs sandbox userpriv usersandbox
Unable to unshare: EPERM (for FEATURES="pid-sandbox")
File not found: /usr/lib/portage/python3.9/ebuild-ipc
 * The ebuild phase 'setup' has exited unexpectedly. This type of behavior
 * is known to be triggered by things such as failed variable assignments
 * (bug #190128) or bad substitution errors (bug #200313). Normally, before
 * exiting, bash should have displayed an error message above. If bash did
 * not produce an error message above, it's possible that the ebuild has
 * called `exit` when it should have called `die` instead. This behavior
 * may also be triggered by a corrupt bash binary or a hardware problem
 * such as memory or cpu malfunction. If the problem is not reproducible or
 * it appears to occur randomly, then it is likely to be triggered by a
 * hardware problem. If you suspect a hardware problem then you should try
 * some basic hardware diagnostics such as memtest. Please do not report
 * this as a bug unless it is consistently reproducible and you are sure
 * that your bash binary and hardware are functioning properly.
File not found: /usr/lib/portage/python3.9/ebuild-ipc
 * The ebuild phase 'die_hooks' has exited unexpectedly. This type of
 * behavior is known to be triggered by things such as failed variable
 * assignments (bug #190128) or bad substitution errors (bug #200313).
 * Normally, before exiting, bash should have displayed an error message
 * above. If bash did not produce an error message above, it's possible
 * that the ebuild has called `exit` when it should have called `die`
 * instead. This behavior may also be triggered by a corrupt bash binary or
 * a hardware problem such as memory or cpu malfunction. If the problem is
 * not reproducible or it appears to occur randomly, then it is likely to
 * be triggered by a hardware problem. If you suspect a hardware problem
 * then you should try some basic hardware diagnostics such as memtest.
 * Please do not report this as a bug unless it is consistently
 * reproducible and you are sure that your bash binary and hardware are
 * functioning properly.
 * Messages for package dev-libs/oniguruma-6.9.7.1:
 * The ebuild phase 'setup' has exited unexpectedly. This type of behavior
 * is known to be triggered by things such as failed variable assignments
 * (bug #190128) or bad substitution errors (bug #200313). Normally, before
 * exiting, bash should have displayed an error message above. If bash did
 * not produce an error message above, it's possible that the ebuild has
 * called `exit` when it should have called `die` instead. This behavior
 * may also be triggered by a corrupt bash binary or a hardware problem
 * such as memory or cpu malfunction. If the problem is not reproducible or
 * it appears to occur randomly, then it is likely to be triggered by a
 * hardware problem. If you suspect a hardware problem then you should try
 * some basic hardware diagnostics such as memtest. Please do not report
 * this as a bug unless it is consistently reproducible and you are sure
 * that your bash binary and hardware are functioning properly.

This also looks odd to me:

 # eselect python list
Available Python interpreters, in order of preference:
  [1]   python3.9 (uninstalled)

The file in question seems legit to me:

7789b003daaf / # ls -la /usr/lib/portage/python3.9/ebuild-ipc
-rwxr-xr-x 1 root root 608 Jun 13 21:26 /usr/lib/portage/python3.9/ebuild-ipc

7789b003daaf / # file /usr/lib/portage/python3.9/ebuild-ipc
/usr/lib/portage/python3.9/ebuild-ipc: Bourne-Again shell script, ASCII text executable

7789b003daaf / # cat /usr/lib/portage/python3.9/ebuild-ipc
#!/bin/bash
# Copyright 2010-2021 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

export __PORTAGE_HELPER_CWD=${PWD}

if [[ ${0##*/} == "ebuild-pyhelper" ]]; then
        echo "ebuild-pyhelper: must be called via symlink" &>2
        exit 1
fi

# Use safe cwd, avoiding unsafe import for bug #469338.
cd "${PORTAGE_PYM_PATH}" || exit 1
for path in "${PORTAGE_BIN_PATH}/${0##*/}"{.py,}; do
        if [[ -x "${path}" ]]; then
                PYTHONPATH=${PORTAGE_PYTHONPATH:-${PORTAGE_PYM_PATH}} \
                        exec "${PORTAGE_PYTHON:-/usr/bin/python}" "${path}" "$@"
        fi
done
echo "File not found: ${path}" >&2
exit 1

Replacing the line if [[ -x "${path}" ]]; then with if [[ -e "${path}" ]]; then seems to work around the issue, but only when running emerge from a shell inside a container, not when using docker build.
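To compare the two tests side by side, something like the following sketch could be run inside the container. It prints how bash's -e and -x tests see a path next to the permission bits reported by stat. The helper name check_exec_test is made up for illustration and is not part of Portage; stat -c assumes GNU coreutils.

```bash
#!/bin/bash
# Hypothetical diagnostic helper (not part of Portage): report how bash's
# -e and -x tests evaluate a path, alongside the mode bits stored on disk.
check_exec_test() {
    local path=$1
    local exists=no executable=no mode=unknown

    # -e only asks whether the path exists.
    [[ -e "${path}" ]] && exists=yes
    # -x asks whether the current process may execute the path.
    [[ -x "${path}" ]] && executable=yes
    # GNU stat prints the symbolic mode, e.g. -rwxr-xr-x.
    mode=$(stat -c '%A' "${path}" 2>/dev/null) || mode=unknown

    echo "path=${path} exists=${exists} executable=${executable} mode=${mode}"
}
```

Calling check_exec_test /usr/lib/portage/python3.9/ebuild-ipc in both an interactive container and a RUN step of docker build would show whether the -x test disagrees with the mode bits in only one of the two environments.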

Given the contents of the file, could this be a Portage bug that is only triggered in specific situations, such as running without a TTY as is the case here?

Anybody else facing that issue?

Thanks!

pgrandin commented 3 years ago

Digging a little bit, I found this commit: https://github.com/gentoo/portage/commit/cec73041df583bfd46e1fa9739286a74a2e85b18

With the last image from February, I cannot reproduce the issue, so I think the above-mentioned commit broke the stage3 images.

ultrabug commented 3 years ago

I guess this should be reported to the developer who made that commit; we on the container side have nothing to do with this, right?

pgrandin commented 3 years ago

Well, I think it highlights a potentially missing test step. It looks to me like the images built since early March are unusable. Is anyone able to emerge packages in recent images?

KSmanis commented 3 years ago

I do; I use these images downstream to package distcc: https://github.com/KSmanis/docker-gentoo-distcc

I am unfamiliar with ebuild-ipc, however, so would you mind explaining why you have to COPY it into the image?

pgrandin commented 3 years ago

Thanks! The COPY was just a workaround. With images prior to the change I mentioned, I don't need it. To get images built after this change to work, I had to change the line I mentioned, but even then it seems to work only from an interactive shell.

pgrandin commented 3 years ago

@KSmanis I had a quick look at your repo, and it looks like you are using latest, whereas I'm using the systemd variant. I've updated the title of this issue.

Update: I am still facing the issue with the following Dockerfile:

FROM gentoo/stage3:latest

ARG BUILD_DATE
LABEL org.label-schema.build-date=$BUILD_DATE

RUN emerge --sync
RUN FEATURES="-pid-sandbox" emerge -q dev-vcs/git eix app-misc/jq

Relevant image:

gentoo/stage3              latest              43cc50051fb9        18 hours ago        883MB

KSmanis commented 3 years ago

I successfully built the provided Dockerfile on two different Docker hosts: a (mostly stable) Gentoo host and an Ubuntu one (18.04 LTS). Could you try building the Dockerfile elsewhere? It could just be that your host (kernel) config is off.

KSmanis commented 3 years ago

Also, any further info on your setup would be useful, e.g., dockerd version, whether you build with buildkit/buildx, etc.

pgrandin commented 3 years ago

Thanks @KSmanis, I think you just solved my issue. My local dev machine was running app-emulation/docker-19.03.15; bumping it to app-emulation/docker-20.10.7 seems to have fixed the issue there. My images are built on CircleCI, where I was using the ubuntu-2004:202010-01 machine image, which also ships with docker-19. Bumping that to ubuntu-2004:202104-01, which includes docker-20, seems to have fixed the issue there as well.
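For anyone hitting the same error, a rough pre-build check of the Docker daemon version might look like the sketch below. The threshold of 20 only reflects the versions reported in this thread (19.03.15 failing, 20.10.7 working); it is an assumption, not an exact boundary. The function names docker_major and is_known_good_docker are made up for illustration.

```bash
#!/bin/bash
# Hypothetical helpers: decide whether a Docker version string is at least
# the major release that worked in this thread (20). The cutoff is an
# assumption taken from the reports above, not a documented requirement.

# Print the major component of a version string such as "20.10.7".
docker_major() {
    local version=$1
    echo "${version%%.*}"
}

# Succeed if the major version is numeric and >= 20, fail otherwise.
is_known_good_docker() {
    local major
    major=$(docker_major "$1")
    [[ ${major} =~ ^[0-9]+$ ]] && (( 10#${major} >= 20 ))
}
```

It could be used as is_known_good_docker "$(docker version --format '{{.Server.Version}}')" || echo "Docker daemon is older than the versions reported to work here" >&2, both on a local machine and in a CI job before the build step.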