pymupdf / PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
https://pymupdf.readthedocs.io
GNU Affero General Public License v3.0

Build for linux/arm64 fails #3622

Open comfuture opened 5 days ago

comfuture commented 5 days ago

Description of the bug

Problem

Installing PyMuPDF fails while installing dependencies during a linux/arm64 Docker build. To produce a multi-arch image, I created a GitHub workflow that builds the image for both amd64 and arm64. The amd64 build succeeds, but the arm64 build gets stuck with the following output:

https://github.com/comfuture/glados/actions/runs/9674016697/job/26688926160#step:8:848

#16 37.91   Downloading PyMuPDF-1.24.6.tar.gz (30.4 MB)
#16 39.94      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 30.4/30.4 MB 14.2 MB/s eta 0:00:00
#16 41.66   Getting requirements to build wheel: started
#16 42.61   Getting requirements to build wheel: finished with status 'done'
#16 42.63   Installing backend dependencies: started
#16 57.14   Installing backend dependencies: finished with status 'done'
#16 57.14   Preparing metadata (pyproject.toml): started
#16 117.9   Preparing metadata (pyproject.toml): still running...
            ... (Repeats forever) ...
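
Note that the log above shows pip downloading `PyMuPDF-1.24.6.tar.gz`, which is a source archive rather than a prebuilt `.whl` file, so the package is being built from source inside the container. A minimal sketch of that distinction follows; `describe_download` and `is_source_distribution` are hypothetical helper names for illustration, not part of pip or PyMuPDF, and the wheel filename in the usage lines is made up.

```python
# Hypothetical helpers (not part of pip or PyMuPDF): classify a file that pip
# reports downloading, based only on its extension.

def is_source_distribution(filename: str) -> bool:
    """Return True for source archives (sdists), False otherwise."""
    return filename.lower().endswith((".tar.gz", ".zip"))


def describe_download(filename: str) -> str:
    """Say whether a downloaded file needs a local build step."""
    if filename.lower().endswith(".whl"):
        return "wheel: installed without compiling"
    if is_source_distribution(filename):
        return "sdist: must be built from source"
    return "unknown"


if __name__ == "__main__":
    # Filename taken from the log above.
    print(describe_download("PyMuPDF-1.24.6.tar.gz"))
    # Made-up wheel filename, for contrast only.
    print(describe_download("example-1.0-py3-none-any.whl"))
```

When the result is an sdist, the "Preparing metadata (pyproject.toml)" step runs the package's own build backend, which is where the time is being spent here.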

How to reproduce the bug

The project has files as follows:

├── .github
│   └── workflows
│       └── deploy.yaml
├── pymupdf_on_arm64
│   └── __init__.py (empty)
├── main.py
├── Dockerfile
└── pyproject.toml

pyproject.toml is as follows:

[build-system]
requires = ["flit_core >=3.9,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "pymupdf_on_arm64"
requires-python = ">=3.10"

dependencies = [
  "PyMuPDF>=1.24.6",
]

Dockerfile is as follows:

FROM python:3.11

ADD . /app
WORKDIR /app

ENV FLIT_ROOT_INSTALL=1
RUN pip install flit
RUN flit install -s

CMD [ "python", "main.py" ]

The workflow deploy.yaml is as follows:

name: deploy
on:
  push:
    branches:
      - main
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: build and push
        uses: docker/build-push-action@v5.2.0
        with:
          tags: ghcr.io/${{ github.repository }}:${{ github.ref_name }},ghcr.io/${{ github.repository }}:latest
          context: .
          file: ./Dockerfile
          platforms: linux/amd64,linux/arm64
          push: false
          outputs: type=image,name=target,annotation-index.org.opencontainers.image.description=PyMuPDF Reproduce build error multi-arch image

Then push the project to the repository, and the workflow will be triggered.

PyMuPDF version

1.24.6

Operating system

Linux

Python version

3.11

julian-smith-artifex-com commented 4 days ago

I'm not sure we should assume the build is broken after just 40m; ARM builds on GitHub often run very slowly. I've seen PyMuPDF ARM builds take several hours. I think GitHub may be using an emulator.

I suggest you let it run until GitHub times out (6h, I think).
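
The workflow in the report sets up QEMU before building for linux/arm64, which is consistent with the arm64 leg running under emulation. As a rough way to confirm which architecture the Python interpreter inside the container sees, something like the following could be run in the image (for example from a temporary `RUN` step). This is only a sketch using the standard library; `describe_build_platform` is a hypothetical name, and the script does not detect emulation itself, it only reports the target architecture.

```python
import platform
import sysconfig


def describe_build_platform() -> dict:
    """Collect the values that indicate which platform pip is building for."""
    return {
        # CPU architecture as reported by the OS, e.g. "x86_64" or "aarch64".
        "machine": platform.machine(),
        # Platform string used by Python's build tooling.
        "platform": sysconfig.get_platform(),
        # Interpreter version, since build behaviour can differ between versions.
        "python": platform.python_version(),
    }


if __name__ == "__main__":
    for key, value in describe_build_platform().items():
        print(f"{key}: {value}")
```

If the reported machine is an ARM architecture while the runner itself is `ubuntu-latest`, the compile step is most likely being emulated, which would fit the slow but not necessarily stuck build described above.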