tonistiigi / binfmt

Cross-platform emulator collection distributed with Docker images.

Building Docker Fedora ARM v7 images fails as 'Out of memory allocating XXX bytes' #109

Closed abraunegg closed 1 year ago

abraunegg commented 2 years ago

Behaviour

Building a Docker Fedora ARMv7 image has been failing for the last 2 months with the following error:

#19 263.9 Out of memory allocating 536870912 bytes!

This appears to be linked to the release of a new image, qemu-v7.0.0-28.

If I force the use of the prior QEMU version, qemu-v6.2.0-26, there is no issue: the images are built correctly and in a timely fashion.

- uses: docker/setup-qemu-action@v2
  with:
    image: tonistiigi/binfmt:qemu-v6.2.0-26
    platforms: all

This issue only impacts Fedora ARM v7 builds; other distributions such as Debian and Ubuntu are not impacted.

Steps to reproduce this issue

Build an image for Fedora ARM v7 within a 'matrix'.

- uses: docker/setup-qemu-action@v2
  with:
    image: tonistiigi/binfmt:latest
    platforms: all

Expected behaviour

Builds for Fedora ARMv7 should complete without issue.

Actual behaviour

Fedora ARMv7 builds fail with Out of memory allocating XXX bytes

Configuration

name: Build Docker Images

on:
  push:
    branches: [ master ]
    tags: [ 'v*' ]
  pull_request:
    branches:
      - master
    types: [closed]

env:
  DOCKER_HUB_SLUG: driveone/onedrive

jobs:
  build:
    if: (!(github.event.action == 'closed' && github.event.pull_request.merged != true))
    runs-on: ubuntu-latest

    strategy:
      matrix:
        flavor: [ fedora, debian, alpine ]
        include:
          - flavor: fedora
            dockerfile: ./contrib/docker/Dockerfile
            platforms: linux/amd64,linux/arm64
          - flavor: debian
            dockerfile: ./contrib/docker/Dockerfile-debian
            platforms: linux/amd64,linux/arm64,linux/arm/v7
          - flavor: alpine
            dockerfile: ./contrib/docker/Dockerfile-alpine
            platforms: linux/amd64,linux/arm64

    steps:
      - name: Check out code from GitHub
        uses: actions/checkout@v3
        with:
          submodules: recursive
          fetch-depth: 0

      - name: Docker meta
        id: docker_meta
        uses: marcelcoding/ghaction-docker-meta@v2
        with:
          tag-edge: true
          images: |
            ${{ env.DOCKER_HUB_SLUG }}
          tag-semver: |
            {{version}}
            {{major}}.{{minor}}
          flavor: ${{ matrix.flavor }}
          main-flavor: ${{ matrix.flavor == 'fedora' }}

      - uses: docker/setup-qemu-action@v2
        if: matrix.platforms != 'linux/amd64'

      - uses: docker/setup-buildx-action@v2

      - name: Cache Docker layers
        uses: actions/cache@v3
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ matrix.flavor }}-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-${{ matrix.flavor }}

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        if: github.event_name != 'pull_request'
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_HUB_ACCESS_TOKEN }}

      - name: Build and Push to Docker
        uses: docker/build-push-action@v3
        with:
          context: .
          file: ${{ matrix.dockerfile }}
          platforms: ${{ matrix.platforms }}
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.docker_meta.outputs.tags }}
          labels: ${{ steps.docker_meta.outputs.labels }}
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache-new

      - name: Move cache
        run: |
          rm -rf /tmp/.buildx-cache
          mv /tmp/.buildx-cache-new /tmp/.buildx-cache

Logs

Failed build: logs_1109.zip

Working build with downgraded QEMU: logs_1132.zip

crazy-max commented 2 years ago

Can you post a repro with a simple Dockerfile and build command used please?

abraunegg commented 2 years ago

@crazy-max A simple repro has been provided ... Unsure what else you need?

This should be fairly easy to reproduce: clone the 'onedrive' repository, use the Dockerfile and the GitHub Actions workflow above, and you should be able to reproduce this consistently.

crazy-max commented 2 years ago

> A simple repro has been provided ... Unsure what else you need?

Posting a link to your repo is not a "simple repro" per se. We have to dig into your repository and logs to find out which Dockerfile and command are used. We are not aware of how your project works. It's also easier for other people to track this kind of issue in the future if the Dockerfile and command being used are posted directly here, as your repo may be removed in the future.

So in:

# -*-Dockerfile-*-

ARG FEDORA_VERSION=36
ARG DEBIAN_VERSION=bullseye
ARG GO_VERSION=1.17
ARG GOSU_VERSION=1.14

FROM golang:${GO_VERSION}-${DEBIAN_VERSION} AS builder-gosu
ARG GOSU_VERSION
RUN go install -ldflags "-s -w" github.com/tianon/gosu@${GOSU_VERSION}

FROM fedora:${FEDORA_VERSION} AS builder-onedrive

RUN dnf install -y ldc pkgconf libcurl-devel sqlite-devel git

ENV PKG_CONFIG=/usr/bin/pkgconf

COPY . /usr/src/onedrive
WORKDIR /usr/src/onedrive

RUN ./configure \
 && make clean \
 && make \
 && make install

FROM fedora:${FEDORA_VERSION}

RUN dnf install -y libcurl sqlite ldc-libs \
 && dnf clean all \
 && mkdir -p /onedrive/conf /onedrive/data

COPY --from=builder-gosu /go/bin/gosu /usr/local/bin/
COPY --from=builder-onedrive /usr/local/bin/onedrive /usr/local/bin/

COPY contrib/docker/entrypoint.sh /
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]

This instruction appears to hang:

RUN ./configure \
 && make clean \
 && make \
 && make install

with: docker buildx build -f ./contrib/docker/Dockerfile --platform linux/arm/v7 .

I will try to reproduce this locally and check whether it works with https://github.com/tonistiigi/binfmt/pull/110. Looking at your Dockerfile, though, I would say it would be better to cross-compile.

crazy-max commented 2 years ago

OK, I am able to reproduce it:

$ docker buildx build -f ./contrib/docker/Dockerfile --platform linux/arm/v7 .
...
#11 [builder-onedrive 2/5] RUN dnf install -y ldc pkgconf libcurl-devel sqlite-devel git
#11 CANCELED
------
 > [stage-2 2/6] RUN dnf install -y libcurl sqlite ldc-libs  && dnf clean all  && mkdir -p /onedrive/conf /onedrive/data:
#10 22.92 Fedora 36 - armhfp                              3.6 MB/s |  76 MB     00:21
#10 105.1 Fedora 36 openh264 (From Cisco) - armhfp        850  B/s | 2.5 kB     00:03
#10 108.1 Fedora Modular 36 - armhfp                      1.6 MB/s | 2.3 MB     00:01
#10 116.7 Fedora 36 - armhfp - Updates                    5.0 MB/s |  26 MB     00:05
#10 152.2 Out of memory allocating 503316480 bytes!
#10 152.2 qemu: uncaught target signal 6 (Aborted) - core dumped

This seems to happen when the builder-onedrive stage and the last stage run dnf install in parallel.

Moving the last stage's RUN dnf install to just before ENTRYPOINT seems to solve the issue:

...

FROM fedora:${FEDORA_VERSION}
COPY --from=builder-gosu /go/bin/gosu /usr/local/bin/
COPY --from=builder-onedrive /usr/local/bin/onedrive /usr/local/bin/

COPY contrib/docker/entrypoint.sh /
RUN chmod +x /entrypoint.sh

RUN dnf install -y libcurl sqlite ldc-libs \
 && dnf clean all \
 && mkdir -p /onedrive/conf /onedrive/data

ENTRYPOINT ["/entrypoint.sh"]

#11 [builder-gosu 2/2] RUN go install -ldflags "-s -w" github.com/tianon/gosu@1.14
#11 1.320 go: downloading github.com/tianon/gosu v0.0.0-20210817173139-9f7cd138a1eb
#11 157.7 go: downloading github.com/opencontainers/runc v1.0.1
#11 194.6 go: downloading golang.org/x/sys v0.0.0-20210817142637-7d9622a276b7
#11 ...

#14 [builder-onedrive 5/5] RUN ./configure  && make clean  && make  && make install
#14 287.6 /usr/bin/install -c -D onedrive /usr/local/bin/onedrive
#14 287.6 /usr/bin/install -c -D -m 0644 onedrive.1 /usr/local/share/man/man1/onedrive.1
#14 287.6 /usr/bin/install -c -D -m 0644 contrib/logrotate/onedrive.logrotate /usr/local/etc/logrotate.d/onedrive
#14 287.6 mkdir -p /usr/local/share/doc/onedrive
#14 287.7 /usr/bin/install -c -D -m 0644 README.md config LICENSE CHANGELOG.md docs/Docker.md docs/INSTALL.md docs/SharePoint-Shared-Libraries.md docs/USAGE.md docs/BusinessSharedFolders.md docs/advanced-usage.md docs/application-security.md /usr/local/share/doc/onedrive
#14 DONE 287.7s

#11 [builder-gosu 2/2] RUN go install -ldflags "-s -w" github.com/tianon/gosu@1.14
#11 DONE 556.3s

#15 [stage-2 2/6] COPY --from=builder-gosu /go/bin/gosu /usr/local/bin/
#15 DONE 0.1s

#16 [stage-2 3/6] COPY --from=builder-onedrive /usr/local/bin/onedrive /usr/local/bin/
#16 DONE 0.1s

#17 [stage-2 4/6] COPY contrib/docker/entrypoint.sh /
#17 DONE 0.1s

#18 [stage-2 5/6] RUN chmod +x /entrypoint.sh
#18 DONE 0.2s

#19 [stage-2 6/6] RUN dnf install -y libcurl sqlite ldc-libs  && dnf clean all  && mkdir -p /onedrive/conf /onedrive/data
#19 14.96 Fedora 36 - armhfp                              5.6 MB/s |  76 MB     00:13    
#19 90.07 Fedora 36 openh264 (From Cisco) - armhfp        843  B/s | 2.5 kB     00:03    
#19 92.97 Fedora Modular 36 - armhfp                      1.6 MB/s | 2.3 MB     00:01    
#19 100.8 Fedora 36 - armhfp - Updates                    5.5 MB/s |  26 MB     00:04    
#19 135.6 Fedora Modular 36 - armhfp - Updates            1.6 MB/s | 2.8 MB     00:01    
#19 140.7 Last metadata expiration check: 0:00:01 ago on Sat Oct  8 23:35:05 2022.
#19 152.3 Package libcurl-7.82.0-2.fc36.armv7hl is already installed.
#19 152.8 Dependencies resolved.
#19 152.8 ================================================================================
#19 152.8  Package          Architecture    Version                 Repository       Size
#19 152.8 ================================================================================
#19 152.8 Installing:
#19 152.8  ldc-libs         armv7hl         1:1.27.1-3.fc36         updates         2.1 M
#19 152.8  sqlite           armv7hl         3.36.0-5.fc36           fedora          721 k
#19 152.8
#19 152.8 Transaction Summary
#19 152.8 ================================================================================
#19 152.8 Install  2 Packages
#19 152.8
#19 152.8 Total download size: 2.8 M
#19 152.8 Installed size: 13 M
#19 152.8 Downloading Packages:
#19 154.7 (1/2): sqlite-3.36.0-5.fc36.armv7hl.rpm         2.8 MB/s | 721 kB     00:00    
#19 154.8 (2/2): ldc-libs-1.27.1-3.fc36.armv7hl.rpm       6.1 MB/s | 2.1 MB     00:00
#19 154.8 --------------------------------------------------------------------------------
#19 154.8 Total                                           1.4 MB/s | 2.8 MB     00:01
#19 155.2 Running transaction check
#19 155.3 Transaction check succeeded.
#19 155.3 Running transaction test
#19 155.4 Transaction test succeeded.
#19 155.4 Running transaction
#19 155.5   Preparing        :                                                        1/1 
#19 155.8   Installing       : ldc-libs-1:1.27.1-3.fc36.armv7hl                       1/2
#19 155.9   Installing       : sqlite-3.36.0-5.fc36.armv7hl                           2/2
#19 156.0   Running scriptlet: sqlite-3.36.0-5.fc36.armv7hl                           2/2 
#19 156.2   Verifying        : sqlite-3.36.0-5.fc36.armv7hl                           1/2 
#19 156.2   Verifying        : ldc-libs-1:1.27.1-3.fc36.armv7hl                       2/2
#19 156.3
#19 156.3 Installed:
#19 156.3   ldc-libs-1:1.27.1-3.fc36.armv7hl         sqlite-3.36.0-5.fc36.armv7hl
#19 156.3
#19 156.3 Complete!
#19 157.6 42 files removed
#19 DONE 157.7s

It would be better to improve your build by using xx cross-compilation helpers and this copr package: https://copr.fedorainfracloud.org/coprs/lantw44/arm-linux-gnueabihf-toolchain/.

abraunegg commented 2 years ago

@crazy-max

> Moving RUN dnf install for the last stage before ENTRYPOINT seems to solve the issue:

Thanks for the suggestion. I've tried this via https://github.com/abraunegg/onedrive/pull/2179. However, the build still fails for ARMv7:

#41 [linux/arm/v7 stage-2 6/6] RUN dnf install -y libcurl sqlite ldc-libs  && dnf clean all  && mkdir -p /onedrive/conf /onedrive/data
#41 26.13 Fedora 36 - armhfp                              3.1 MB/s |  76 MB     00:24    
#41 148.9 Fedora 36 openh264 (From Cisco) - armhfp        804  B/s | 2.5 kB     00:03    
#41 153.5 Fedora Modular 36 - armhfp                      1.0 MB/s | 2.3 MB     00:02    
#41 164.7 Fedora 36 - armhfp - Updates                    4.0 MB/s |  26 MB     00:06    
#41 214.8 Out of memory allocating 603979776 bytes!
#41 ERROR: process "/bin/sh -c dnf install -y libcurl sqlite ldc-libs  && dnf clean all  && mkdir -p /onedrive/conf /onedrive/data" did not complete successfully: exit code: 134

#21 [linux/arm64 builder-onedrive 2/5] RUN dnf install -y ldc pkgconf libcurl-devel sqlite-devel git
#21 CANCELED
------
 > [linux/arm/v7 stage-2 6/6] RUN dnf install -y libcurl sqlite ldc-libs  && dnf clean all  && mkdir -p /onedrive/conf /onedrive/data:
#41 26.13 Fedora 36 - armhfp                              3.1 MB/s |  76 MB     00:24    
#41 148.9 Fedora 36 openh264 (From Cisco) - armhfp        804  B/s | 2.5 kB     00:03    
#41 153.5 Fedora Modular 36 - armhfp                      1.0 MB/s | 2.3 MB     00:02    
#41 164.7 Fedora 36 - armhfp - Updates                    4.0 MB/s |  26 MB     00:06    
#41 214.8 Out of memory allocating 603979776 bytes!

logs_1159.zip

> It would be better to improve your build by using xx cross-compilation helpers and this copr package

I will look into this. However, building these Docker images used to work flawlessly.

abraunegg commented 2 years ago

@crazy-max

> It would be better to improve your build by using xx cross-compilation helpers and this copr package

Having read through https://github.com/tonistiigi/xx/, using xx is not going to be possible, as DMD and LDC (the compilers) do not support cross-compiling. Each has specific installs and binaries per architecture.

crazy-max commented 2 years ago

The "Out of memory allocating" error with dnf on armhf (32-bit) seems to occur even without any emulation: https://stackoverflow.com/questions/45605806/out-of-memory-allocating-bytes
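
For scale, the allocation sizes reported in this thread's logs can be converted to MiB. The following is only an illustrative Python sketch: the helper name `to_mib` is made up here, and the byte counts are copied from the "Out of memory allocating N bytes!" lines posted above.

```python
# Byte counts copied from the "Out of memory allocating N bytes!" log lines in this thread.
FAILED_ALLOCATIONS = [536870912, 503316480, 603979776]


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


for size in FAILED_ALLOCATIONS:
    print(f"{size} bytes = {to_mib(size):.0f} MiB")
```

Each failed request works out to a single allocation of 512, 480 and 576 MiB respectively, which is a sizeable share of the at most 4 GiB that a 32-bit process can address. This does not by itself identify the cause.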

Can you try this Dockerfile, which uses yasu instead of gosu so that you skip a compilation step?

ARG FEDORA_VERSION=36
ARG DEBIAN_VERSION=bullseye
ARG GO_VERSION=1.17
ARG YASU_VERSION=1.19.0

FROM crazymax/yasu:${YASU_VERSION} AS yasu

FROM fedora:${FEDORA_VERSION} AS builder-onedrive
RUN dnf install -y ldc pkgconf libcurl-devel sqlite-devel git

ENV PKG_CONFIG=/usr/bin/pkgconf
COPY . /usr/src/onedrive
WORKDIR /usr/src/onedrive

RUN ./configure \
 && make clean \
 && make \
 && make install

FROM fedora:${FEDORA_VERSION}
COPY --from=yasu /usr/local/bin/yasu /usr/local/bin/gosu
COPY --from=builder-onedrive /usr/local/bin/onedrive /usr/local/bin/

RUN dnf install -y libcurl sqlite ldc-libs \
 && dnf clean all \
 && mkdir -p /onedrive/conf /onedrive/data

COPY contrib/docker/entrypoint.sh /
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]

Can you also try with #110?

$ docker run --privileged --rm crazymax/binfmt:v7.1.0 --uninstall qemu-*
$ docker run --privileged --rm crazymax/binfmt:v7.1.0 --install all

abraunegg commented 2 years ago

@crazy-max

> Can you try with this Dockerfile using yasu instead of gosu so you skip a compilation step?

Same issue using 'yasu'

https://github.com/abraunegg/onedrive/actions/runs/3238259588/jobs/5306283902

abraunegg commented 1 year ago

@crazy-max From: https://github.com/docker/setup-qemu-action/issues/60#issuecomment-1301777552

> @abraunegg We have identified some issues with QEMU 7.0.0 on our side too.

What issues did you identify on your side? Can you elaborate on this at all?

> ... Can you try with
>
>   image: tonistiigi/binfmt:master@sha256:c6fd7794c689f9144fddd93cb8cbc39a141aec5aa853df0849315e8f62a0dfa2
>
> This tag is based on QEMU 7.1.0 (see tonistiigi/binfmt#110).

Currently building with this specific tag. Will advise post run.

abraunegg commented 1 year ago

@crazy-max Unfortunately the build fails, both with the original gosu and with the option you suggested of using yasu in place of gosu:

[Screenshot of the failed build]

Error:

[Screenshot of the error]

pexcn commented 1 year ago

I tried it; unfortunately the build still failed. 😥

abraunegg commented 1 year ago

@crazy-max Any update or further suggestions here?

pexcn commented 1 year ago

Any updates? 🥲

crazy-max commented 1 year ago

You can switch to tonistiigi/binfmt:qemu-v6.2.0 in the meantime.

pexcn commented 1 year ago

> You can switch to tonistiigi/binfmt:qemu-v6.2.0 in the meantime.

I have tried that, but it failed; see: https://github.com/pexcn/docker-images/actions/runs/3351441314/jobs/5552892161

crazy-max commented 1 year ago

@pexcn I don't see anything related to the original issue "Building Docker Fedora ARM v7 images fails as 'Out of memory allocating XXX bytes'" in the logs: https://github.com/pexcn/docker-images/actions/runs/3391825550/jobs/5637357880. Also does not look linked to QEMU issue. Please open a new issue with logs and repro.

pexcn commented 1 year ago

> @pexcn I don't see anything related to the original issue "Building Docker Fedora ARM v7 images fails as 'Out of memory allocating XXX bytes'" in the logs: https://github.com/pexcn/docker-images/actions/runs/3391825550/jobs/5637357880. Also does not look linked to QEMU issue. Please open a new issue with logs and repro.

But it returns exit code 137. Doesn't that mean out of memory?
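
For reference, exit statuses above 128 are conventionally read as "128 + signal number". Below is a minimal Python sketch of that arithmetic for the two codes seen in this thread (134 in the dnf failure, 137 here). The function name `describe_exit_code` and the two-entry signal table are illustrative assumptions, not part of any tool mentioned in this thread.

```python
# Small subset of Linux signal numbers, just enough for the exit codes in this thread.
SIGNAL_NAMES = {6: "SIGABRT", 9: "SIGKILL"}


def describe_exit_code(code: int) -> str:
    """Describe an exit status using the common '128 + signal number' convention."""
    if code > 128:
        signum = code - 128
        name = SIGNAL_NAMES.get(signum, f"signal {signum}")
        return f"terminated by {name}"
    return f"exited with status {code}"


for code in (134, 137):
    print(code, "->", describe_exit_code(code))
```

Under that convention, 134 maps to SIGABRT, which fits the "qemu: uncaught target signal 6 (Aborted)" line in the earlier log, and 137 maps to SIGKILL. SIGKILL is what the kernel OOM killer sends, but other things can send it too, so 137 alone is suggestive rather than conclusive.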

crazy-max commented 1 year ago

@pexcn Yes, it looks like it, but with cargo. Can you open another issue for this one? Thanks.

@abraunegg Closing this issue as it seems Fedora dropped ARMv7 support.