h-vetinari opened this issue 1 year ago
Reading this, I see the drawback that we will have an activation script with numpy, and thus some (unexpected) hurdles for maintainers with numpy as a build dependency (if they want a newer numpy version). What would be the benefit of providing numpy=1.25 as the default? I don't see it.
It's perhaps possible to do this without an activation script, that was just the first thing that came to mind...
What would be the benefit of providing numpy=1.25 as the default?
I don't have a strong argument (or preference) here. But whenever we get to numpy>=1.25 as a default, we'd IMO have to adapt the run-export. It would also be a bit weird to jump from (a future) >=1.24 back to >=1.19 (based on the API default of 1.25), but I guess that could be a one-time transition. It also wouldn't match NEP 29 anymore...
Wouldn't it be better to deal with it for 2.0? That's less than 6 months away, and at that point there is a hard necessity to deal with C API/ABI stuff.
Yeah, that's part of what I wanted to discuss here, not just the backwards compat by default, but also 2.0.
It also doesn't need an immediate decision, there's no urgency AFAICT.
I think this will be useful with the 2.0 release; we could pin to 2 and set the environment variables like we do for the C compilers at build time.
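To sketch that analogy (the variable below is illustrative, not an existing conda-forge interface): the numpy activation could export a target version the way the compiler activation exports CC/CXX. Since NPY_TARGET_VERSION is a C preprocessor define rather than something numpy reads from the environment, build scripts would still have to forward it to the compiler:

```bash
# Illustrative only: exported by a hypothetical numpy activation script,
# analogous to how the compiler activation scripts export CC/CXX.
export NPY_TARGET_VERSION="NPY_2_0_API_VERSION"

# A recipe's build script would then forward it to the compiler explicitly:
export CFLAGS="${CFLAGS:-} -DNPY_TARGET_VERSION=${NPY_TARGET_VERSION}"
```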
I would argue to not move away from the current setup. Even if we set NPY_TARGET_VERSION using an environment variable, there are 2 issues:
1. If a project itself sets NPY_TARGET_VERSION, the metadata will not be correct. However if we build with NumPy 1.25 and have >=1.25, we are guaranteed that the metadata is correct even though it could have been looser. (This is exactly what we do with the macOS SDK and deployment target, by setting them to the same version by default. For example: if SDK = 11 and target = 10.9, the symbols introduced in 10.15 are visible, but they need to be treated as weak symbols on 10.9, which requires the developer to handle them correctly in their C/C++ code.)
2. A looser pin in that case is not necessarily better. Most users will want an updated numpy anyway, and having that in place will make it easier (faster) for the solver to provide a solution with it. Sure, there may be a small portion of users who may need an older numpy and won't be able to install it, but I believe the advantages outweigh the disadvantages.
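For concreteness, the macOS analogy in activation terms (the SDK path is illustrative):

```bash
# The compiler activation keeps SDK and deployment target aligned by default.
# Setting the target below the SDK version is possible, but then symbols newer
# than the target are only weakly linked and must be availability-guarded in
# the project's own C/C++ code.
export MACOSX_DEPLOYMENT_TARGET="10.9"   # minimum OS version required at runtime
export SDKROOT="/opt/MacOSX11.0.sdk"     # build-time SDK (illustrative path)
```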
If a project itself sets NPY_TARGET_VERSION, the metadata will not be correct.
Isn't that a general problem that we'll have to look out for in any case?
I'm not sure if that is something we could easily determine from a compiled artefact (numpy does embed the C-API level, AFAIK), but it seems it would be good to check, after building, which numpy target version got used. That way we could verify that things didn't get lost or overridden by the project or the build system.
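One rough way to approximate such a check (a process sketch with a hypothetical module name, not a turnkey tool): install the oldest numpy that the run-export claims to support and try importing the built extension; numpy's import-time C-API check fails if the module was in fact compiled for a newer target:

```bash
# Sketch: verify that the claimed floor of the run-export actually holds.
# "some_extension" is a hypothetical module name; the freshly built package
# would need to be installed into this environment as well.
conda create -yp ./np-floor python=3.9 numpy=1.19
conda run -p ./np-floor python -c "import some_extension"
```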
Isn't that a general problem that we'll have to look out for in any case?
No. See my comment highlighted below
However if we build with NumPy 1.25 and have >=1.25, we are guaranteed that the metadata is correct even though it could have been looser.
I spoke with @rgommers recently, and he mentioned one thing about this that wasn't clear to me before:
Packages compiled with numpy 2.0 will continue to be compatible with the 1.x ABI.
In other words, if this works out as planned, we could support numpy 2.0 right away without having to do a full CI-bifurcation of all numpy-dependent packages. It would mean using 2.0 as a baseline earlier than we'd do it through NEP29, but given the now built-in backwards compatibility, we could set the pinning to 2.0, and manually set the numpy run-export to do something like numpy >=1.19 (which apparently won't be changed until numpy drops python 3.9 support).
No. See my comment highlighted below
I wasn't talking about the tightness/looseness of the constraints, but about projects setting NPY_TARGET_VERSION in their build scripts somewhere, which has the potential to conflict (in terms of expectations, not constraints) with whatever we do.
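As a hypothetical illustration of that kind of conflict:

```bash
# Hypothetical fragment of an upstream project's build script: the project
# hard-codes its own C-API target, silently overriding whatever default the
# distribution injected into CFLAGS.
export CFLAGS="${CFLAGS} -DNPY_TARGET_VERSION=NPY_1_25_API_VERSION"
```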
Now that we've started migrating for CPython 3.13 (which requires numpy 2.1), the migrator has a major-only pin: https://github.com/conda-forge/conda-forge-pinning-feedstock/blob/a94f54b17dec84bb3d1c70cf8e8dc4e4893e7ac4/recipe/migrations/python313.yaml#L41-L42
Meanwhile, the numpy 2 migrator pins to 2.0: https://github.com/conda-forge/conda-forge-pinning-feedstock/blob/a94f54b17dec84bb3d1c70cf8e8dc4e4893e7ac4/recipe/migrations/numpy2.yaml#L47-L51
Do we want a major-only pin, or still decide when we update the baseline numpy version (in a post-numpy-2.0 world)? For example, using a major-only pin means we'll start pulling in 2.1 as soon as it's available, and this creates a tighter run-export (>=1.21) than 2.0 (>=1.19), which is something we may want to do consciously rather than accidentally. OTOH, both of those are substantially looser than what NEP 29 suggests (c.f. https://github.com/conda-forge/numpy-feedstock/issues/324).
I think both approaches are workable, we should just decide on one or the other.
Would suggest the Python 3.13 migrator be updated to use 2.1 instead of 2. The main reason being that Python 3.13 requires numpy 2.1 in any case; everything else can stay the same.
Think if we want to change this more dramatically, we should probably wait for these migrators to complete and reassess. It is always easier to relax things later (as opposed to tightening). Also, trying to work in more changes with multiple in-flight migrators is hairy.
Though open to discussion if others have different opinions.
Would suggest the Python 3.13 migrator be updated to use 2.1 instead of 2.
I support this for the reasons you stated, though at least there's no immediate urgency on this. As there are no numpy 2.0 builds for 3.13, the two are equivalent in this particular case (they wouldn't be for py<313 though, hence my question).
I'd probably choose the major-only flavor, because that's the actual requirement. But it doesn't really matter either way, since builds are going to be using 2.1 anyway now that that is available.
But it doesn't really matter either way, since builds are going to be using 2.1 anyway now that that is available.
That's not the case; if we pin 2.0, then that's what gets installed in host while building (but 2.1 at runtime of course)
That's not the case; if we pin 2.0, then that's what gets installed in host while building (but 2.1 at runtime of course)
Major-only meant 2, not 2.0. What I meant was that both 2 and 2.1 will yield 2.1 at build time (until 2.2 comes out, of course).
I think 2 is the more correct choice, as it will avoid having to manually bump the minor version all the time.
Not sure if people saw already, but numpy 1.25 introduced a pretty big change: it is now possible to compile against numpy 1.25 while explicitly targeting an older C-API level, via the new NPY_TARGET_VERSION define.
Also from those release notes: numpy is now planning the long-only-mythical 2.0 release as the follow-up to 1.26 (which is roughly 1.25 + meson + CPython 3.12 support), so we will have to touch this setup in the not-too-distant future anyway.
We're currently on 1.22 as per NEP29, so AFAICT we could consider using numpy 1.25 with NPY_1_22_API_VERSION as an equivalent setup (this probably needs to go into an activation script for numpy...?). CC @conda-forge/numpy @conda-forge/core
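A minimal sketch of what such an activation script could look like, assuming the define is injected via CFLAGS (the file name and mechanism are assumptions, not an existing interface):

```bash
# etc/conda/activate.d/numpy-target-version.sh (hypothetical path)
# Build against the numpy 1.25 headers while targeting the 1.22 C-API,
# matching the NEP 29 baseline mentioned above.
export CFLAGS="${CFLAGS:-} -DNPY_TARGET_VERSION=NPY_1_22_API_VERSION"
export CXXFLAGS="${CXXFLAGS:-} -DNPY_TARGET_VERSION=NPY_1_22_API_VERSION"
```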