Feature Request: flag to skip build if there is not a numpy wheel available

fgregg commented 11 months ago

Description

A great many C extensions depend upon numpy.

If numpy does not have a binary wheel targeting that platform, version, and architecture, then almost always it is going to be harder for the user to compile numpy from source than the dependent library.

It would be great to add an environment flag to skip building wheels when there is not already a numpy wheel available.

This could be accomplished either by attempting the build and bailing out if pip finds no wheel available, or by using the PyPI API to look at the wheels available for the most recent version of numpy and deduce what to skip.

Almost all the things I'm skipping in CIBW_SKIP are because of missing numpy wheels.
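The second approach above (asking PyPI which numpy wheels exist) could be sketched roughly as follows. This is only an illustration, not anything cibuildwheel provides: the helper names `wheel_tags` and `fetch_latest_wheel_tags` are made up for this example, and it assumes the `urls` list returned by PyPI's JSON API for the latest release, plus the standard wheel filename layout.

```python
import json
from urllib.request import urlopen


def wheel_tags(release_files):
    """Return {(python_tag, platform_tag)} for every wheel in a PyPI file list."""
    tags = set()
    for entry in release_files:
        if entry.get("packagetype") != "bdist_wheel":
            continue  # ignore sdists
        # Wheel filename layout: name-version[-build]-python-abi-platform.whl
        stem = entry["filename"][: -len(".whl")]
        python_tag, _abi_tag, platform_tag = stem.split("-")[-3:]
        # Compressed tag sets use "." as a separator, so expand them
        for py in python_tag.split("."):
            for plat in platform_tag.split("."):
                tags.add((py, plat))
    return tags


def fetch_latest_wheel_tags(package="numpy"):
    """Query PyPI's JSON API for the files of the latest release (needs network)."""
    with urlopen(f"https://pypi.org/pypi/{package}/json") as response:
        payload = json.load(response)
    return wheel_tags(payload["urls"])
```

Mapping these tags onto cibuildwheel build identifiers is the harder part and is left out here; the platform tags on PyPI (for example a specific manylinux or macOS version) do not correspond one-to-one to identifiers.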

joerick commented 10 months ago

I wonder if a lot of people have the same issue. It's kinda advanced but I wonder if a neat syntax for this would be something like CIBW_SKIP_IF_FAIL or CIBW_SKIP_UNLESS e.g. CIBW_SKIP_UNLESS=pip install --only-binary :all: numpy. Might be worth considering if lots of people would find it useful.

alexlancaster commented 10 months ago

I wonder if a lot of people have the same issue. It's kinda advanced but I wonder if a neat syntax for this would be something like CIBW_SKIP_IF_FAIL or CIBW_SKIP_UNLESS e.g. CIBW_SKIP_UNLESS=pip install --only-binary :all: numpy. Might be worth considering if lots of people would find it useful.

I like this idea!

I think this would be quite a common use case. For my project pypop, which depends on numpy as an install-time and test-time dependency, I try to build as many wheels as possible with cibuildwheel that will work for the end user, while specifically not building wheels for pypop that would require a dependency (like numpy) to be built from source during installation. That way lies disappointment for the majority of end users.

But as the status of numpy wheels for particular architectures changes (normally more are added), I have to resort to trial and error to converge on the exact set of exclusions in skip that I need to add to pyproject.toml. Here is my process:

1. Start with not skipping any wheels, build all possible wheels, and note the failures, especially those due to numpy.
2. Add the versions that failed due to missing numpy wheels to skip in [tool.cibuildwheel].
3. Periodically check numpy on PyPI for new wheels; if a new wheel is now provided, remove it from skip.
4. Re-run.

Running the full workflow just to check whether a new wheel is available is error-prone, and depends on when I think to look at PyPI. It would be much better to have it be dynamic, as suggested above with a new CIBW_SKIP_UNLESS (or skip_unless within pyproject.toml). That way, I would only need to include in skip the builds that fail for reasons like a fundamental architecture incompatibility, rather than skipping wheels that don't install because of the vagaries of whether the upstream PyPI project happens to build that wheel at this moment.

I'm not sure how easy this would be, but it would also be nice if it could be integrated into the parallel style of cibuildwheel on GitHub Actions, as shown in https://iscinumpy.dev/post/cibuildwheel-2-10-0/. In other words, it would be useful if CIBW_SKIP_UNLESS could be provided in the generate_wheels GitHub job, so that the wheel is left out of the matrix generated by that job rather than being skipped in the actual build-wheel step, and no empty jobs are generated.
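As a rough sketch of the matrix-generation idea above: a small script in the generate job could take the identifiers printed by `cibuildwheel --print-build-identifiers` and drop those whose Python tag has no dependency wheel, before emitting the matrix JSON. The function name `build_matrix`, the `only` matrix key, and the simplification of matching on the Python tag alone (ignoring platform and architecture) are all assumptions made for this illustration.

```python
import json


def build_matrix(identifiers, available_python_tags):
    """Keep only build identifiers whose Python tag has a dependency wheel."""
    include = []
    for identifier in identifiers:
        # cibuildwheel identifiers look like "cp312-manylinux_x86_64"
        python_tag, _, _platform = identifier.partition("-")
        if python_tag in available_python_tags:
            include.append({"only": identifier})
    # JSON string suitable for use as a GitHub Actions matrix
    return json.dumps({"include": include})
```

Because the filtering happens before the matrix is emitted, no job is started for an identifier that would only be skipped later.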

Czaki commented 4 months ago

@joerick @henryiii I think that I may have time to implement such a feature (maintaining NumPy compatibility is annoying).

What do you think about adding a CIBW_REQUIRE_WHEELS option, which would use pip install --python-version --only-binary :all: --implementation --dry-run packages (or something similar) to check whether wheels are available and, if not, skip the given Python version?
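A minimal sketch of such a probe, written as a standalone script rather than as a cibuildwheel feature. Note that it swaps in `pip download` for the `pip install --dry-run` form proposed above, since `pip download` accepts the same target-selection options; the function names are hypothetical, and the probe does download the wheel into a temporary directory as a side effect.

```python
import subprocess
import sys
import tempfile


def wheel_probe_command(package, python_version, implementation, platform, dest):
    """Build a pip command that only succeeds if a matching binary wheel exists."""
    return [
        sys.executable, "-m", "pip", "download",
        "--only-binary", ":all:",  # never fall back to an sdist
        "--no-deps",               # probe this package only
        "--python-version", python_version,
        "--implementation", implementation,
        "--platform", platform,
        "--dest", dest,
        package,
    ]


def has_wheel(package, python_version, implementation, platform):
    """Return True if pip can find a wheel for the given target (needs network)."""
    with tempfile.TemporaryDirectory() as dest:
        cmd = wheel_probe_command(package, python_version, implementation, platform, dest)
        return subprocess.run(cmd, capture_output=True).returncode == 0
```

For example, `has_wheel("numpy", "3.10", "pp", "manylinux2014_x86_64")` would ask whether a PyPy 3.10 wheel can be resolved for that platform tag.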

agriyakhetarpal commented 4 months ago

Though it's sort of going away and being maintained less, having oldest-supported-numpy in one's [build-system.requires] table might be able to serve the same purpose: cibuildwheel could identify the available NumPy versions at wheel build time and alter CIBW_PROJECT_REQUIRES_PYTHON as necessary if a wheel is not available, because cibuildwheel already reads those values from pyproject.toml.

Edit: never mind, I noticed under the README that they posted a "Deprecation notice for NumPy 2.0" section 😅

henryiii commented 4 months ago

That's a deprecated package, and doesn't help here anyway. And cibuildwheel does not read requirements from pyproject.toml.

Having a way to get cibuildwheel to continue on error and list the errors at the end would really help. Anything else isn't that helpful, and I don't want cibuildwheel to ever report a failed build (for any reason) as "okay, just skip it". What happens if you misconfigure and it thinks it can't get binary wheels for all of macOS, for example? Should that just "pass"? This could be really easy to trigger without meaning to; for example, SciPy is now setting a higher macOS version. The solution would be to also increase your macOS target version, not to skip all macOS wheels! You should have to explicitly list all the skips, but you shouldn't have to iterate one at a time to get the skip list.

You can already set PIP_ONLY_BINARY:":all:", I do that in several packages.
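Until something like continue-on-error exists in cibuildwheel itself, the behaviour described above can be approximated from the outside. The sketch below is an assumption-laden wrapper, not a cibuildwheel API: it runs one `cibuildwheel --only <identifier>` invocation per identifier and collects every failure rather than stopping at the first. The function names are hypothetical.

```python
import subprocess


def run_cibuildwheel(identifier):
    """Build a single identifier; return True on success."""
    return subprocess.run(["cibuildwheel", "--only", identifier]).returncode == 0


def build_all(identifiers, build=run_cibuildwheel):
    """Try every identifier and return the ones that failed."""
    # No early exit: every identifier is attempted even after a failure
    return [identifier for identifier in identifiers if not build(identifier)]
```

A caller would print the returned list and still exit non-zero when it is non-empty, so a failed build is never reported as a pass; the list simply gives the full set of candidates for skip in one run.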

agriyakhetarpal commented 4 months ago

I agree that the macOS version thing is a little shoddy because it might not even be detected at runtime—if, say, someone doesn't choose to test their package with CIBW_TEST_COMMAND and publishes their wheels with 12_0, but SciPy would be providing just 13_0 and above (soon, someday), which pip won't be able to install on a macos-12 runner.

Other than that, a common strategy for getting cibuildwheel to continue on error, or at least working around it, is to get the allowed Python versions as JSON and use them in a GHA matrix setup so that wheels are built across Python versions in parallel, as you may have seen already (it might be worth documenting that – I think the only concerns around this are binary size differences and possibly reproducibility?)

Czaki commented 4 months ago

You can already set PIP_ONLY_BINARY:":all:", I do that in several packages.

It sometimes fails for testing, as some pure Python packages still provide only an sdist.

Having a way to get cibuildwheel to continue on error and list the errors at the end would really help

But waiting on error may be a waste of time. Or do you think that better documenting PIP_ONLY_BINARY with a list of packages, and parsing the error log for failures to resolve, is the proper solution?

henryiii commented 4 months ago

It sometimes fails for testing

You can use PIP_ONLY_BINARY: numpy, etc. One reason it's good to do this per-package.

But waiting on error may be a waste of time.

You only pay this once, then you put the skips in and never pay the cost again. While if you auto-fail, then every build will have to start up and try to download the wheel, then fail.

I'd like to better document PIP_ONLY_BINARY (and UV_ONLY_BINARY, at a guess?), but I'd like some way to see all the failed builds, rather than failing one at a time.

henryiii commented 4 months ago

(https://github.com/pypa/cibuildwheel/issues/1062)

Czaki commented 4 months ago

You only pay this once, then you put the skips in and never pay the cost again. While if you auto-fail, then every build will have to start up and try to download the wheel, then fail.

No, it is not paid only once. You need to do this every time you want to check whether some dependency already has a wheel.

When Python 3.13 is out, wheels for many packages will be waiting on other packages to release their wheels. Another good example may be PyPy 3.10. There are no such wheels for NumPy now, so one may need to skip it. But with your solution, that person needs to actively track whether NumPy for pp310 has been released and then update the config.

alexlancaster commented 4 months ago

You only pay this once, then you put the skips in and never pay the cost again. While if you auto-fail, then every build will have to start up and try to download the wheel, then fail.

No, it is not paid only once. You need to do this every time you want to check whether some dependency already has a wheel.

When Python 3.13 is out, wheels for many packages will be waiting on other packages to release their wheels. Another good example may be PyPy 3.10. There are no such wheels for NumPy now, so one may need to skip it. But with your solution, that person needs to actively track whether NumPy for pp310 has been released and then update the config.

Yes, exactly: you have to periodically remove the _SKIP entries to check whether the skips are still needed, since wheels are often eventually built for dependencies. Otherwise you end up skipping Python versions that don't need to be skipped. It's a game of whack-a-mole.

pypa / cibuildwheel

Feature Request: flag to skip build if there is not a numpy wheel available #1701