freedomofpress / dangerzone

Take potentially dangerous PDFs, office documents, or images and convert them to safe PDFs
https://dangerzone.rocks/
GNU Affero General Public License v3.0

Install PyMuPDF via the prebuilt Python wheels for musl #850

Open apyrgio opened 6 days ago

apyrgio commented 6 days ago

As of PyMuPDF 1.24.6, the developers provide prebuilt musl-based wheels:

Add musllinux x86_64 wheels to release.

This change can significantly improve our Dockerfile, and specifically these two lines:

https://github.com/freedomofpress/dangerzone/blob/e7e3430ca11ef768485b2d27e069ea76e9923fec/Dockerfile#L10-L11

Instead of installing build tools (g++, make, etc.) and building PyMuPDF from source, pip install will install the necessary wheels. This will greatly improve the build times of our container image.

There's one problem though: there are still no prebuilt musl-based wheels for aarch64. Since both of our architectures share the same Dockerfile, we can't easily take different actions per architecture. For example, the prebuilt wheels also create the following path: /usr/lib/python3.12/site-packages/PyMuPDFb.libs/. The regular wheels do not include this path, so our Dockerfile would have to copy it conditionally, depending on the architecture:

https://github.com/freedomofpress/dangerzone/blob/e7e3430ca11ef768485b2d27e069ea76e9923fec/Dockerfile#L64-L65
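To make the per-architecture difference concrete, here is a rough Python sketch of the check a build step would need. It is not part of the Dockerfile or of PyMuPDF. The helper names (expects_prebuilt_wheel, bundled_libs_dir) and the ARCHES_WITH_MUSL_WHEELS set are hypothetical, and the set only encodes the situation described above (x86_64 has musl wheels, aarch64 does not).

```python
import platform
from pathlib import Path
from typing import Optional

# Assumption taken from this issue: only x86_64 currently has prebuilt
# musl-based PyMuPDF wheels. Update this set if that changes upstream.
ARCHES_WITH_MUSL_WHEELS = {"x86_64"}


def expects_prebuilt_wheel(machine: str) -> bool:
    """Return True if a prebuilt musl wheel is expected for this architecture."""
    return machine in ARCHES_WITH_MUSL_WHEELS


def bundled_libs_dir(site_packages: Path) -> Optional[Path]:
    """Return the PyMuPDFb.libs directory if the installed wheel created it."""
    libs = site_packages / "PyMuPDFb.libs"
    return libs if libs.is_dir() else None


if __name__ == "__main__":
    machine = platform.machine()
    site_packages = Path("/usr/lib/python3.12/site-packages")
    print(f"{machine}: prebuilt wheel expected = {expects_prebuilt_wheel(machine)}")
    print(f"bundled libs dir = {bundled_libs_dir(site_packages)}")
```

Checking for the directory itself, rather than only the architecture, would keep working unchanged if aarch64 wheels are published later. A Dockerfile COPY instruction cannot branch like this on its own, which is why the shared Dockerfile is the sticking point.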

This is still an open problem, and we have to wait until the PyMuPDF devs have a solution for it.