jamesturk / jellyfish

🪼 a python library for doing approximate and phonetic matching of strings.
https://jamesturk.github.io/jellyfish/
MIT License

Wheel builds for musllinux and more architectures #181

Closed MartinoMensio closed 1 year ago

MartinoMensio commented 1 year ago

Hi @jamesturk, thanks for this very useful library!

This PR addresses the build of wheels with maturin.

The main changes are:

1. musllinux platform added

Some Linux distros (e.g. Alpine Linux) use musl instead of glibc. I followed PEP 656 and targeted musllinux. Maturin supports this platform, so I only had to configure the build of the wheels. Without these wheels, installing jellyfish with pip from the sdist may take half an hour.
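To make the glibc/musl distinction concrete, here is a minimal sketch that reads the platform tag from a wheel filename and reports which libc family it targets. The helper name `libc_family` and the regular expression are illustrative assumptions, not part of jellyfish or maturin. The filename layout follows the wheel naming convention from PEP 427.

```python
import re

# Wheel filename layout (PEP 427):
#   {distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
WHEEL_RE = re.compile(
    r"^(?P<dist>[^-]+)-(?P<version>[^-]+)(?:-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def libc_family(wheel_name: str) -> str:
    """Return 'musl', 'glibc', or 'other' based on the wheel's platform tag.

    Hypothetical helper for illustration only.
    """
    match = WHEEL_RE.match(wheel_name)
    if match is None:
        raise ValueError(f"not a wheel filename: {wheel_name!r}")
    # A compressed tag set can list several platforms separated by dots,
    # e.g. "manylinux_2_17_x86_64.manylinux2014_x86_64".
    platforms = match.group("platform").split(".")
    if any(p.startswith("musllinux_") for p in platforms):
        return "musl"
    if any(p.startswith("manylinux") for p in platforms):
        return "glibc"
    return "other"


if __name__ == "__main__":
    for name in (
        "jellyfish-0.11.0-cp310-cp310-musllinux_1_1_x86_64.whl",
        "jellyfish-0.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "jellyfish-0.11.0-cp310-none-win_amd64.whl",
    ):
        print(name, "->", libc_family(name))
```

On Alpine, pip can only use the wheels classified as `musl` here. With only `glibc` wheels published, it falls back to building from the sdist.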

2. interpreter selection in maturin

Instead of using setup-python and then --find-interpreter, we can pass the explicit -i argument and list the interpreters for the build. This has 3 main advantages:

  1. no redundant builds: if you look at the logs for the Build wheels step on your master branch (e.g. the latest stable release), you will see that multiple Python interpreters are found each time, so the matrix strategy iterations rebuild the same wheels many times.
  2. fewer matrix strategy combinations: by listing all the interpreters at once, the matrix varies only by platform and not by interpreter.
  3. deterministic list of built wheels: --find-interpreter leads to compiled versions that are not declared anywhere, or to missing wheels (see the last list below).
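The effect of listing interpreters explicitly can be sketched as follows. The platform and interpreter names are illustrative assumptions and not the exact values used in this PR's workflow. The only maturin features relied on are `maturin build --release` and the repeatable `-i` option.

```python
from itertools import product

# Illustrative values only; the real workflow may list different entries.
PLATFORMS = ["linux-x86_64", "linux-aarch64", "macos", "windows"]
INTERPRETERS = ["python3.8", "python3.9", "python3.10", "python3.11", "pypy3.9"]


def maturin_build_args(interpreters):
    """Build one maturin invocation that names every interpreter explicitly."""
    args = ["maturin", "build", "--release"]
    for interpreter in interpreters:
        # -i can be repeated, once per interpreter to build for.
        args += ["-i", interpreter]
    return args


def matrix_jobs(platforms, interpreters=None):
    """List the CI jobs: one per platform, or one per (platform, interpreter)."""
    if interpreters is None:
        return [(platform,) for platform in platforms]
    return list(product(platforms, interpreters))


if __name__ == "__main__":
    print(" ".join(maturin_build_args(INTERPRETERS)))
    print("jobs per (platform, interpreter):", len(matrix_jobs(PLATFORMS, INTERPRETERS)))
    print("jobs per platform only:", len(matrix_jobs(PLATFORMS)))
```

With these sample values the matrix shrinks from 20 jobs to 4, and the set of interpreters is written down in one place instead of being discovered at build time.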

3. setup-python action not necessary to build wheels

Since maturin relies on virtualization to build the wheels, there is no need to set up Python. The interpreters are declared and used inside the virtualized environment.

Results

As a result, the same wheels are still being built (compare the artifacts generated by the workflow):

jellyfish-0.11.0-cp310-cp310-macosx_10_7_x86_64.whl
jellyfish-0.11.0-cp310-cp310-macosx_11_0_arm64.whl
jellyfish-0.11.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp310-none-win32.whl
jellyfish-0.11.0-cp310-none-win_amd64.whl
jellyfish-0.11.0-cp311-cp311-macosx_10_7_x86_64.whl
jellyfish-0.11.0-cp311-cp311-macosx_11_0_arm64.whl
jellyfish-0.11.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp311-none-win32.whl
jellyfish-0.11.0-cp311-none-win_amd64.whl
jellyfish-0.11.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp37-none-win32.whl
jellyfish-0.11.0-cp37-none-win_amd64.whl
jellyfish-0.11.0-cp38-cp38-macosx_10_7_x86_64.whl
jellyfish-0.11.0-cp38-cp38-macosx_11_0_arm64.whl
jellyfish-0.11.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp38-none-win32.whl
jellyfish-0.11.0-cp38-none-win_amd64.whl
jellyfish-0.11.0-cp39-cp39-macosx_10_7_x86_64.whl
jellyfish-0.11.0-cp39-cp39-macosx_11_0_arm64.whl
jellyfish-0.11.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp39-none-win32.whl
jellyfish-0.11.0-cp39-none-win_amd64.whl
jellyfish-0.11.0-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-pp39-pypy39_pp73-macosx_10_7_x86_64.whl
jellyfish-0.11.0-pp39-pypy39_pp73-macosx_11_0_arm64.whl
jellyfish-0.11.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
jellyfish-0.11.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

In addition, the following wheels are being generated for musllinux:

jellyfish-0.11.0-cp310-cp310-musllinux_1_1_i686.whl
jellyfish-0.11.0-cp310-cp310-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-cp311-cp311-musllinux_1_1_i686.whl
jellyfish-0.11.0-cp311-cp311-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-cp37-cp37m-musllinux_1_1_i686.whl
jellyfish-0.11.0-cp37-cp37m-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-cp38-cp38-musllinux_1_1_i686.whl
jellyfish-0.11.0-cp38-cp38-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-cp39-cp39-musllinux_1_1_i686.whl
jellyfish-0.11.0-cp39-cp39-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-pp38-pypy38_pp73-musllinux_1_1_i686.whl
jellyfish-0.11.0-pp38-pypy38_pp73-musllinux_1_1_x86_64.whl
jellyfish-0.11.0-pp39-pypy39_pp73-musllinux_1_1_i686.whl
jellyfish-0.11.0-pp39-pypy39_pp73-musllinux_1_1_x86_64.whl

The following wheels, which were previously missed when using the --find-interpreter parameter, are now generated as well:

jellyfish-0.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
jellyfish-0.11.0-cp37-cp37m-macosx_10_7_x86_64.whl
jellyfish-0.11.0-cp37-cp37m-macosx_11_0_arm64.whl
jellyfish-0.11.0-pp38-pypy38_pp73-macosx_10_7_x86_64.whl
jellyfish-0.11.0-pp38-pypy38_pp73-macosx_11_0_arm64.whl
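A check like the one behind these lists can be sketched as a set difference between the wheels one expects and the wheels actually produced. The function names below are hypothetical, and the expected set is built from a hand-written list of tags rather than from the real workflow configuration.

```python
def expected_wheels(version, tags, platforms, dist="jellyfish"):
    """Enumerate wheel filenames for every (python tag, abi tag) x platform pair."""
    return [
        f"{dist}-{version}-{python_tag}-{abi_tag}-{platform}.whl"
        for python_tag, abi_tag in tags
        for platform in platforms
    ]


def missing_wheels(expected, built):
    """Return the expected wheel names that were not built, sorted."""
    return sorted(set(expected) - set(built))


if __name__ == "__main__":
    # Sample input taken from the lists in this PR: with --find-interpreter,
    # the cp312 wheel was built for aarch64 but not for x86_64.
    platforms = [
        "manylinux_2_17_aarch64.manylinux2014_aarch64",
        "manylinux_2_17_x86_64.manylinux2014_x86_64",
    ]
    expected = expected_wheels("0.11.0", [("cp312", "cp312")], platforms)
    built = [
        "jellyfish-0.11.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
    ]
    for name in missing_wheels(expected, built):
        print("missing:", name)
```

Running a comparison of this kind against the workflow artifacts would flag gaps such as the cp312 x86_64 wheel before a release.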

I tested some of the wheels in Docker, and they seem to work on Alpine Linux without problems. Maybe a step that tests all the generated wheels should be added before releasing to PyPI.
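Such a test step could be as small as the following smoke test, run inside each target environment after installing the wheel. This is only a sketch: the function name `smoke_test` and its return values are assumptions, and it checks just one call from the jellyfish documentation (`levenshtein_distance('jellyfish', 'smellyfish')` returning 2).

```python
import importlib
import sys


def smoke_test(module_name="jellyfish"):
    """Import the installed package and check one known result.

    Returns 'ok', 'missing' (the package cannot be imported), or 'wrong result'.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "missing"
    distance = module.levenshtein_distance("jellyfish", "smellyfish")
    return "ok" if distance == 2 else "wrong result"


if __name__ == "__main__":
    status = smoke_test()
    print("smoke test:", status)
    # A non-zero exit code lets a CI step fail when the wheel is broken.
    sys.exit(0 if status == "ok" else 1)
```

Running it in an Alpine container for the musllinux wheels, and in the matching environment for each other platform, would confirm that every wheel at least imports and computes a correct result.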

Feel free to further edit or comment!

Best, Martino

jamesturk commented 1 year ago

Thanks for this! Very much appreciate the help improving this process.

MartinoMensio commented 1 year ago

You're very welcome James!