wlav / cppyy-backend

23 stars 27 forks source link

Add GitHub action to build wheels with cibuildwheel #8

Closed finsberg closed 1 year ago

finsberg commented 1 year ago

In this PR I have created a GitHub action to build wheels for cling. Currently the following OS + Architectures build successfully

The following combinations failed due to timeout after 6 hours (this is apparently the limit https://docs.github.com/en/actions/learn-github-actions/usage-limits-billing-and-administration#usage-limits).

The reason these are so slow it because they are using emulation through QEMU. Another approach would be to use a native aarch64 runner (but I am not sure if this is available through GitHub Actions).

The follwing combinations failed due to some other reason

You can see the full output here: https://github.com/finsberg/cppyy-backend/actions/runs/4084705242

A few things worth noting

  1. Before each build I run python create_src_directory.py (specified in the pyproject.toml), and in some cases this failed. I didn't really investigate this, but as a temporary solution I commented out the call to sys.exit in order for the workflow to continue.
  2. I could not get auditwheel repair to work, see https://github.com/finsberg/cppyy-backend/actions/runs/4048282754/jobs/6963318442#step:8:4430 for more info. As a result, I did disable auditwheel, by setting tool.cibuildwheel.linux.repair-wheel-command = "" in pyproject.toml.
  3. After each build I ran a simple test to just check that the library is installed correctly by simply running python -c "import cppyy_backend". We might want to add a proper test to make sure some core functionality is working as expected.
  4. I was using the workflow from pyzmq (https://github.com/zeromq/pyzmq/blob/main/.github/workflows/wheels.yml) as a guide. We could potentially get some inspiration from their setup scripts in order to better streamline this workflow.
  5. Doing this for clingwrapper should be very similar.
  6. If you want to accept this then it would probably be a good to add a finaly step for uploading the wheels to pypi using https://github.com/marketplace/actions/pypi-publish than can run every time you make a release.

This PR is related to https://github.com/wlav/cppyy/issues/123

wlav commented 1 year ago

Thank you! I never even tried for cppyy-cling b/c it builds LLVM, which is time-consuming (although I'm truly surprised by it taking more than 6hrs, even when emulated). For conda, they use a stock LLVM, but I have some patches to it, so prefer to make it part of the distribution. I always assumed that building cppyy-cling would run way over monthly usage limits.

create_src_directory.py is what pulls in the Cling/Clang/LLVM sources and patches them. If it fails, but the compilation itself succeeds, then this may be due to the patching failing. On Linux/Mac, it expects that the patch utility is available, on Windows it expects that the patch Python package is installed. Is "some cases" specific to one platform, or is it random?

For auditwheel, I'm not sure how it's trying to locate libThreadLegacy,so. If it's doing a search, it should be able to find it, but if it relies on it being loadable e.g. by dlopen, I can see that failing. I can try it offline.

One way to further reduce the time needed is to not run for more than 1 python version per platform. Both cppyy-cling and cppyy-backend are independent from Python (although cppyy-cling needs some python executable present to complete the build as there are some Python scripts as part of the build process). Only CPyCppyy is Python-dependent (and not used in the case of pypy, which has a builtin _cppyy module); the cppyy top-level package is again Python-independent (b/c is pure Python).

finsberg commented 1 year ago

Thanks for your feedback.

I tried to put the sys.exit back in create_src_directory.py and it seems to actually work fine now. Don't know why this was an issue before.

However it seems to be a new problem after I merged the master branch into my branch, see e.g https://github.com/finsberg/cppyy-backend/actions/runs/4101851124/jobs/7074102266#step:8:2624

Regarding auditwheel, I did try to run this offline, but I was not able to find a solution but I belive you would be better suited to debug this. You could for example use the docker image specified here: https://github.com/finsberg/cppyy-backend/actions/runs/4108080889/jobs/7088343480#step:8:42

Also we are currently not running any repair command for the windows wheels. One option would be to use https://pypi.org/project/delvewheel/

Regarding python versions, I am actually only building one python version per combination of OS and architecture.

wlav commented 1 year ago

The new problem is most likely my switch to --std=c++20 as the default. Probably the compiler used does not support it and cmake falls back to C++11. That error is complaining about a C++14 feature. There are other issues with using C++20 to build LLVM13, so I'll be reverting that.

finsberg commented 1 year ago

@wlav I managed to build aarch64 wheels using CircleCI here: https://app.circleci.com/pipelines/github/finsberg/cppyy-backend/37/workflows/216ea4ea-dd38-4c4b-832d-2ce56b379771 It looks like running MacOS in CircleCI requires a payed account, so one option would be to build aarch64 wheels on CircleCI and the rest on GitHub actions and then manually downloading these wheels and uploading them to pypi when you have a new release. Alternatively, it might be possible to fetch the built wheels from GitHub actions, but I suspect that this might be a bit tricky.

What do you think?

stuaxo commented 1 year ago

You might be able to do something from the circleci end on a successful build - in the post below they are publishing to a github release, for instance

https://circleci.com/blog/publishing-to-github-releases-via-circleci/

wlav commented 1 year ago

The switch to C++20 by default is now done. Or at least, code now compiles and works for me on Linux, MacOS, and Windows.

finsberg commented 1 year ago

@wlav I have now finally managed to get this to work. I am now building the aarch64 wheel on CircleCI and the rest of the wheels on GitHub Actions. You can check out this run and see that all the wheels are successfully uploaded as artifacts.

The way it works is that you need to first trigger a pipeline on CircleCI from GitHub actions and then make sure to get the job number using the CircleCI Rest API. This will start the pipeline on CircleCI. At the same time we start building the wheels on GitHub action and once that is done we assume that the pipeline on CircleCI is also done. Then we create a new job that fetches the wheel from Circle CI

I have created a separate python script both for triggering the pipeline and for fetching the wheel.

What remains is to add a new job for uploading the wheels to pypi, but for this we could use https://github.com/marketplace/actions/pypi-publish. You can also check out this pipeline as an example.

Also, in order to trigger the pipeline you need to set up a token with admin rights (the different scopes are "Status", "Read only" and "Admin" and we need to do more than "Read"), and save this as a repository secret called CIRCLE_API_TOKEN.

You also want to change the default argument org in the python script that interacts with CircleCI to your own username.

Final note is that I have turned off the repair wheel command on linux and windows. I am not sure how important this is, but it might also be a good idea to add some better testing of the built wheels. Again, you can check out a more advanced setup here

wlav commented 1 year ago

I'm slowly working through the details, thanks for all the work!

For auditwheel, if I try it interactively, I get Error: This tool only supports Linux.

finsberg commented 1 year ago

@wlav The reason it is failing here: https://github.com/wlav/cppyy-backend/actions/runs/4422767181/jobs/7757608629 is becasue you need to set a secret with the name CIRCLE_API_TOKEN see screenshot. You can read more about how to set this up here: https://circleci.com/docs/managing-api-tokens/

Screenshot 2023-03-15 at 11 20 57
wlav commented 1 year ago

I've set the CIRCLE_API_TOKEN several times (regenerated, set again), but it never shows up in the job: it always shows an empty wheel_path b/c the token value is empty: https://github.com/wlav/cppyy-backend/actions/runs/6128239126/job/16636370448

I've completely run out of ideas how else to fix this ... do you know anything else that it could be?

finsberg commented 1 year ago

I've set the CIRCLE_API_TOKEN several times (regenerated, set again), but it never shows up in the job: it always shows an empty wheel_path b/c the token value is empty: https://github.com/wlav/cppyy-backend/actions/runs/6128239126/job/16636370448

I've completely run out of ideas how else to fix this ... do you know anything else that it could be?

That is strange. Did make sure to set up the project on CIRCLECI? In my dashboard it looks as follows:

Screenshot 2023-09-11 at 14 00 19

Also, you need to use the Personal API token , i.e under User Settings, not Project Settings.

I tried to run it using my account here: https://github.com/finsberg/cppyy-backend/actions/runs/6146025185/job/16674588355 and it is still working for me

wlav commented 1 year ago

Thanks for the reference!

Turns out the token is okay. What failed is that this default: false in .circleci\config.yml makes that config invalid. I don't understand why that doesn't fail for you, because in your successful run, it's set to the same. Changing it to true finally makes the circleci portion run (and complete).

After that, the copy still fails, still because wheel_path is empty, but at least there has been some progress. https://github.com/wlav/cppyy-backend/actions/runs/6152749868/job/16697931339

(I still need to test all the produced wheels. The Mac ARM one wasn't usable b/c the full path to the compiler was embedded, which will require some fixing; others may suffer from the same.)

finsberg commented 1 year ago

Thanks for the reference!

Turns out the token is okay. What failed is that this default: false in .circleci\config.yml makes that config invalid. I don't understand why that doesn't fail for you, because in your successful run, it's set to the same. Changing it to true finally makes the circleci portion run (and complete).

After that, the copy still fails, still because wheel_path is empty, but at least there has been some progress. https://github.com/wlav/cppyy-backend/actions/runs/6152749868/job/16697931339

(I still need to test all the produced wheels. The Mac ARM one wasn't usable b/c the full path to the compiler was embedded, which will require some fixing; others may suffer from the same.)

I belive it is still the same problem. Changing default to true just tells circle ci that the pipeline should allways run when you push to the repo (which is probably not what you want). You could try to debug this locally by running

python3 circleci.py job --token <add the token here>

which might reveal some more information. And it might be useful to print out res.read() before decoding it in https://github.com/wlav/cppyy-backend/blob/8c774b1d41a37f72de95e0ad443d7cb3129bad65/circleci.py#L72

wlav commented 1 year ago

The problem appears to be simpler, but I'm no closer to a solution (I tried playing with quotes, but saw no difference).

The "Get wheel from CircleCI artifact" step, although marked as successful actually fails with:

Run export WHEEL_PATH=$(python circleci.py artifact --job-number $JOB_NUMBER --token ***)
usage: circleci.py [-h] [--token TOKEN] [--vcs VCS] [--org ORG]
                   [--project PROJECT] [--job-number JOB_NUMBER]
                   [--build-aarch64-wheel BUILD_AARCH64_WHEEL]
                   {artifact,job}
circleci.py: error: argument --job-number: expected one argument

so it's $JOB_NUMBER that is somehow the empty string.

wlav commented 1 year ago

Just an update to show I'm not forgetting about this. :)

Setting of the job number fails way earlier b/c the JSON result isn't properly formatted:

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Still debugging ...

I'm trying out the wheels at the same time, but so far, all of them are broken. The Linux ones are manylinux1, which can't be used as is b/c of an ABI change starting gcc5 (and manylinux1 isn't supported anymore anyway). I've yet to be able to convince the system to run manylinux2014 instead. The Mac ARM one for some reason builds binaries for the wrong architecture (x86_64). I have yet to test the Windows ones.

wlav commented 1 year ago

The MacOS thing at least is understood (and similar issues are probably on some of the other builds): the runner is an x86_64 machine that cross compiles. Although cibuildwheel will setup the proper configuration, Cling is in turn build using cmake which re-derives the toolchain and picks the host as its target, ie. it selects x86_64.

wlav commented 1 year ago

Doh, I finally figured out the job number issue:

branch: str = "build-wheels-with-cibuildwheel"

which only exists in your repo. The error message is not in proper JSON format, after which a Python exception is raised and the job number data is empty.

wlav commented 1 year ago

I'm going to punt on this one. It's been very useful to catch some compiler errors on different platforms early and to resolve the issues some people experienced on Mac M1/2. However, overall it's a net negative time sink as these CI platforms/builds have defaults that simply don't work for me.

I've removed the PyPy builds as they're not necessary. The Linux builds default to the old pre-C++11 ABI and with no amount of work have I been able to change their mind. The MacOS ARM build is cross-compiled, which doesn't work b/c some of the final tools are used in the build process, i.e. they won't run if cross-compiled and aren't useful if not. The Windows builds fail b/c somewhere there's a compiler setting to prevent exporting symbols, so cppyy-backend won't link. Finally, I still can't get the circle-ci job to trigger and I suspect it will suffer from the same ABI problem regardless. (I haven't tried the Mac native build yet.)

Thanks for the effort, though! It would have been nice, but contrary to other Python extension modules (which hide all symbols by design), cppyy needs external (C++11) linkage and it's clearly to difficult to convince these build systems otherwise.