conda-forge / pythran-feedstock

A conda-smithy repository for pythran.
BSD 3-Clause "New" or "Revised" License
1 stars 11 forks source link

MAINT: revendor xsimd #76

Closed h-vetinari closed 1 year ago

h-vetinari commented 1 year ago

The current pin was added in https://github.com/conda-forge/pythran-feedstock/commit/3fe8468a15ab286a1fa3f5bda3c71253e39d3aa1 at the time of pythran 0.11 and not touched since. Before that, we pinned even harder, c.f. that commit:

-    - xsimd 7.6.*
+    - xsimd >=8.0.5,<8.1

There are no further comments in https://github.com/conda-forge/pythran-feedstock/pull/63, but I suspect I didn't want to unpin to rashly and break something.

Edit: the thrust of this PR changed based on the discussion below, to revendor xsimd for the time being, rather than have to deal with pinning (sometimes unreleased, patched) xsimd versions, which is the only thing that pythran is testing/supporting at the moment.

conda-forge-webservices[bot] commented 1 year ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

rgommers commented 1 year ago

I guess part of the issue here is the unvendoring of xsimd. As a result, there is no metadata in pythran itself (because the correct working version is always present in third_party/ inside the pythran repo.

For consideration:

serge-sans-paille commented 1 year ago

I trend to agree with Ralf here. As xsimd gets more and more stable, the situation may evolve, but the only setup we currently test is the vendoring one.

h-vetinari commented 1 year ago

@conda-forge/xsimd and @SylvainCorlay specifically:

@rgommers: this means that it's pretty clear that this unvendoring [of xsimd] is a bad idea, and currently unreliable for any package using Pythran. I think it should be reverted. There really are no downsides to vendoring a header-only library.

@serge-sans-paille: I tend to agree with Ralf here. As xsimd gets more and more stable, the situation may evolve, but the only setup we currently test is the vendoring one.

This would revert your request from a quite a while ago: #14 / #15. Any opposition?

Taken for pythran itself, I think the arguments are convincing (not least the header-only aspect!), but I'm not sure how that would interact with other uses of xsimd. The alternative is effectively mirroring the exact state of the xsimd headers per pythran version, which would be a pain e.g. if patches get backported (like https://github.com/serge-sans-paille/pythran/pull/2104 recently)

FYI @isuruf

eli-schwartz commented 1 year ago

Declaring the same dependency but as conda metadata instead of vendored files should fix incompatibilities in general.

on top of that there is the long-standing bug in conda itself with non-Bash shells which means the unvendoring breaks hard on some shells

This could be solved by making pythran refer to xsimd using a symlink.

rgommers commented 1 year ago

but I'm not sure how that would interact with other uses of xsimd.

There should not be a problem there I'd think, unless there's a bug report to point to (not impossible some global symbol leaks, but seems unlikely).

Declaring the same dependency but as conda metadata instead of vendored files should fix incompatibilities in general.

That'd be a problem, because that is a == dependency on xsimd, which is typically not what you want - if another package has an incompatible constraint, the packages will not be installable together.

This could be solved by making pythran refer to xsimd using a symlink.

Doesn't work on Windows, and would be another hack. It'd be better to spend the time fixing the problems conda seems to have with compiler activation in non-Bash shells than doing this kind of thing imho.

SylvainCorlay commented 1 year ago

In general, the policy for conda-forge is to not vendor as much as possible.

h-vetinari commented 1 year ago

In general, the policy for conda-forge is to not vendor as much as possible.

Yes. However, in this case it's causing compatibility problems due to how tightly pythran and xsimd are connected, and it brings essentially no benefit because it's header only.

So we want to revendor despite conda-forge's policy, which I'm well aware of.

The question is whether that would negatively impact xsimd usecases somehow.

eli-schwartz commented 1 year ago

That'd be a problem, because that is a == dependency on xsimd, which is typically not what you want - if another package has an incompatible constraint, the packages will not be installable together.

It's not a problem that causes pythran or xsimd to have mysteriously missing headers. It's just the same problem that happens any time projects are insistent on what version of something they want. This isn't a problem unique to xsimd -- although you actually CAN solve it for C/C++ headers, since you can stuff them in versioned directories and have them be parallel installable, just like having both gcc-10 and gcc-13 installed.

Python software can't do that, of course.

Doesn't work on Windows, and would be another hack. It'd be better to spend the time fixing the problems conda seems to have with compiler activation in non-Bash shells than doing this kind of thing imho.

"solved on really recent versions of Windows", supposedly. :rofl:

You can emulate it though -- adding a stub header into the pythran package that does #include "../../../path/to/global/xsimd/header.hpp" should be compatible with idealistic vendoring policies to the same extent that symlinks are compatible.

eli-schwartz commented 1 year ago

Yes. However, in this case it's causing compatibility problems due to how tightly pythran and xsimd are connected, and it brings essentially no benefit because it's header only.

At least one benefit it brings is the ability to generate an SBOM from the list of conda packages without looking into package-specific details of what is vendored.

The argument that something is special because header only, is a pretty terrible argument IMO. If everyone did this then disaster would quickly ensue and no one would be able to keep track of all their vendored code. There are more ideological reasons to argue against vendored code than just "you don't have to build it multiple times if you already built it once", and that's the only (non-)advantage that header-only libraries have (you have to build it the same number of times regardless of whether it is vendored or not).

The question is whether xsimd<->pythran are special compared to header-only libraries in general. Maybe they are, maybe they aren't, I don't know.

h-vetinari commented 1 year ago

At least one benefit it brings is the ability to generate an SBOM from the list of conda packages without looking into package-specific details of what is vendored.

That assumes perfect unvendoring of everything in conda-forge, which despite our best efforts, is not the status quo. I don't think we advertise being able to generate SBOMs, and we shouldn't IMO, as that comes with a whole lot of rules that I don't think we're ready to fulfill.

The question is whether xsimd<->pythran are special compared to header-only libraries in general.

Using the wrong xsimd headers can break pythran, and as pythran's main developer said above: "the *only* setup we currently test is the vendoring one." And pythran doesn't even necessarily vendor exact versions, but may layer unreleased patches on top (and depend on those for correct operation). This is what happened around the release of 0.13 recently.

Unvendoring here comes with risks of breakage or untested behaviour changes, and for no real gain, as the headers need to get recompiled everywhere anyway, and we don't have an ABI of a compiled library that we have to keep stable across the ecosystem.

Of course I'm generally strongly in favour of unvendoring, but here it seems the net benefit is negative.

eli-schwartz commented 1 year ago

That assumes perfect unvendoring of everything in conda-forge, which despite our best efforts, is not the status quo. I don't think we advertise being able to generate SBOMs, and we shouldn't IMO, as that comes with a whole lot of rules that I don't think we're ready to fulfill.

I don't think that a failure to be perfect everywhere is an indictment on having a policy that it should be so. Surely the solution is to be perfect everywhere, right? So the best way to be perfect everywhere is to start by being perfect in one more place than you were before, not by reducing coverage for the policy compliance.

Being able to scrape that SBOM data from a list of packages is, regardless, very useful seed data for someone checking into their dependencies.

It's also useful outside of producing SBOMs for the same reason it's useful for producing SBOMs -- because people can look into what they are using, track down the sources of bugs, especially platform-specific bugs, etc.

And pythran doesn't even necessarily vendor exact versions, but may layer unreleased patches on top (and depend on those for correct operation). This is what happened around the release of 0.13 recently.

IMHO it would be nice in such a case for projects to follow "release early, release often", especially when they control both sides of the stack. Particularly for bug fixes.

h-vetinari commented 1 year ago

IMHO it would be nice in such a case for projects to follow "release early, release often", especially when they control both sides of the stack. Particularly for bug fixes.

We (conda-forge) don't control how upstream pythran/xsimd do their releases. It's fine to identify ways that things can be improved (and try to convince upstream to adopt those), but right now, the job here is to not ship broken packages for the status quo, not some future ideal.

Surely the solution is to be perfect everywhere, right?

Not without a comprehensive plan of what we're trying to do, otherwise it's just aimless churn that doesn't actually achieve anything (like reliable SBOMs). To my knowledge, no-one is talking about generating SBOMs based on conda-forge metadata, so my point is: SBOMs are not in scope here, by a vast margin of unresolved issues, and so philosophical fidelity with that utopia is completely besides the much more pressing point of dealing with the issues in the here and now.

rgommers commented 1 year ago

The abstract discussion is perhaps distracting a little bit from the current issue, which is that the dependency info is wrong and that has given us concrete and annoying problems. There are only two options here:

  1. keep the unvendoring and add the correct metadata, i.e. a == pin that gets updated for each pythran release
  2. remove the unvendoring

That choice is up to the feedstock maintainers & conda-forge team in the end; (2) seems more pragmatic to me, but (1) would also work. Could we please come to a decision?

h-vetinari commented 1 year ago

Seeing that pythran currently contains an unreleased xsimd version, I'm going with the revendoring. If pythran switches to a model that's more amenable to unvendoring (or people want to create separate, patched xsimd-releases just so we can pin them here), changing this back is just a PR away.