
META: `{{ stdlib("c") }}` migration #2102

Open h-vetinari opened 4 months ago

h-vetinari commented 4 months ago

If you got directed here by the linter...

A short overview of the situation and what to do can be found in this announcement. For the gory details, see below...


One of the few things that conda cannot ship sanely & safely is the C standard library, which is an integral part of the operating system (certainly on UNIX) and very ABI-sensitive. This is also the reason why distributions that do ship the C standard library (glibc on linux) stay on the same version for the lifetime of that distribution, which in turn is why the glibc version is the "clock" measuring distro age in PEP 600, and thus manylinux wheels (statistics).

For almost the entire duration of conda-forge's existence (AFAIU; some of this is way before my time), the underlying C standard library has not changed from:

- glibc 2.12 on linux (the version shipped by CentOS 6)
- macOS 10.9 as the deployment target on osx

Given that glibc 2.12 is almost 14 years old (and CentOS 6 has been EOL for over 3 years), and that macOS 10.9 has been EOL for over 7 years (with Apple aggressively pushing users to upgrade), a lot of modern packages are starting to require newer, sometimes much newer, glibc versions (the reason we were able to hold out so long w.r.t. OSX is that we ship our own C++ standard library, but we cannot do that for C). With the breadth of feature expansion in the C23 standard, this will also continue well into the future, as projects eventually start picking these things up.

For packages already requiring a newer C stdlib, there have been work-arounds in place for a long time:

- on linux, manually adding a newer sysroot_linux-64 to the host requirements, or building on a CentOS 7 image via os_version: cos7 in conda-forge.yml
- on osx, raising MACOSX_DEPLOYMENT_TARGET for the given feedstock

But these are ultimately hacks, and not ready to support a wide-scale roll-out. This is what has blocked us from releasing libcxx 17.0, for example: it requires macOS >=10.13, and would virally spread that requirement to essentially all packages that use C++ anywhere in their dependencies (which means more or less all of conda-forge).

The need to move to greener pastures has been clear for a long time, though the work involved is painful and it wasn't clear how to do it exactly. In any case, we've already announced that:

- the linux baseline will move from glibc 2.12 (CentOS 6) to glibc 2.17 (CentOS 7)
- the osx baseline will move from macOS 10.9 to 10.13

The latter announcement already explicitly added the caveat that we need to figure out how to do this first: the fact that we've never really bumped the C stdlib baseline version on linux or osx means that it has crept into lots of places as a silent assumption waiting to break. The most critical piece is correct metadata for the produced packages - in particular, if some package requires a newer C stdlib than a user has (as identified by the __glibc / __osx virtual packages, which need at least conda >=4.8), the package should never be installed.
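For illustration, correct metadata amounts to a virtual-package constraint on the produced package; a sketch of what a rendered run requirement looks like (in practice this is injected via run_exports rather than written by hand):

```yaml
requirements:
  run:
    # prevents installation on systems whose glibc is older than
    # the baseline the package was built against
    - __glibc >=2.17    # [linux]
```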

After several discussions in cf/core, as well as with people from Anaconda & Prefix, we have now identified a way forward, which involves a new jinja function {{ stdlib("c") }} to match {{ compiler("c") }}. This is a welcome increase in expressivity, as the version of the C standard library and that of the compiler are indeed very distinct things (as for other languages too...), but with the substantial downside that no recipes are using this functionality yet. This means we have to add this function to essentially all compiled recipes, to ensure we have the mechanics in place for later bumping the minimum required version (while also allowing selective opt-ins to newer or older versions per feedstock).
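Concretely, that means adding one line next to the compiler in each recipe; a minimal sketch of the relevant part of a meta.yaml:

```yaml
requirements:
  build:
    - {{ compiler("c") }}
    # new: makes the C stdlib an explicit, centrally pinnable input
    - {{ stdlib("c") }}
```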

The function is designed to parallel the {{ compiler("...") }} jinja in many ways, so the corresponding keys in conda_build_config.yaml should be unsurprising:

```yaml
c_stdlib:
  - sysroot                    # [linux]
  - macosx_deployment_target   # [osx]
  - vs                         # [win]
c_stdlib_version:              # [unix]
  - 2.12                       # [linux and x86_64]
  - 2.17                       # [linux and not x86_64]
  - 10.9                       # [osx and x86_64]
  - 11.0                       # [osx and arm64]
```
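For orientation, and mirroring how {{ compiler("c") }} joins name and version (the exact package names below are an assumption on my part):

```yaml
# what {{ stdlib("c") }} roughly resolves to per platform:
#   linux-64:   sysroot_linux-64 2.12
#   osx-64:     macosx_deployment_target_osx-64 10.9
#   osx-arm64:  macosx_deployment_target_osx-arm64 11.0
#   win-64:     vs_win-64 (unversioned; see the windows discussion below)
```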

The required functionality was merged into conda-build as of 3.28, but some bugs are still being ironed out, and it likely won't be fully functional until the upcoming conda-build 24.3.

This issue is intended for discussion, questions, etc. around this entire effort, beyond the specifics that will arise for glibc 2.17 and macOS 10.13, respectively.


h-vetinari commented 4 months ago

@conda-forge/core, I think this would be worth an announcement?

h-vetinari commented 3 months ago

@conda-forge/core, the blockers for https://github.com/regro/cf-scripts/pull/2135 are all close to being resolved. In case someone still has feedback on that PR, please let me know! I'm planning to start this as soon as we have smithy and https://github.com/regro/conda-forge-feedstock-check-solvable done.

h-vetinari commented 3 months ago

Exciting times - the piggyback migrator was merged, and should start working in the next couple of hours, in conjunction with the boost 1.84 migration. 🤩🥳

I'll try to keep an eye on this, but please ping me (or comment here) if something seems to be going wrong!

h-vetinari commented 3 months ago

I went through the open PRs of the boost migration today, and I think we've reached a point where we could switch on the piggyback for all migrations now. In any case, if you find an issue with what the bot is proposing (w.r.t. stdlib-changes), please let us know in https://github.com/regro/cf-scripts/issues/2328.

carterbox commented 3 months ago

Are the Windows pinnings missing c_stdlib_version? It looks like this key exists for unix, but not for Windows. I noticed as part of this migration that using the stdlib template on Windows causes two Windows compiler wrappers to be installed in my build environment. I assume this is because the compiler is pinned to vs2019, but the stdlib floats to the latest version (vs2022).

Something like:

```yaml
c_stdlib_version:
  - 2019                     # [win and x86_64]
  - 2022                     # [win and arm64]
```

h-vetinari commented 3 months ago

Are the Windows pinnings missing c_stdlib_version?

AFAIU, this shouldn't be necessary (kinda like we don't have c_compiler_version on windows either). That said, the double-compilers are a bit weird (even though vs2019 gets chosen correctly as the compiler according to the logs).

CC @isuruf, since this was based on his suggestion originally.

h-vetinari commented 3 months ago

FWIW, pulling in vs2022 is problematic in the sense that the windows STL (the C++ standard library) requires clang >=16, so any recipes using clang <16 on windows are broken by pulling in vs2022.

There also seem to be some problems in gazebo w.r.t. pulling in vs2022 (CC @traverso), so considering all that, I'm going to add the version pins for windows for now (which stops pulling in both vs2019 & vs2022, and fixed compilation in https://github.com/conda-forge/dcgp-python-feedstock/pull/24, for example).

h-vetinari commented 3 months ago

Another issue related to pulling in vs2022 in arrow:

```
-- Providing CMake module for zstdAlt as part of Gandiva CMake package
CMake Error at src/gandiva/precompiled/CMakeLists.txt:44 (message):
  Unsupported MSVC_VERSION=1938
```

carterbox commented 2 months ago

AFAIU, this shouldn't be necessary (kinda like we don't have c_compiler_version on windows either).

I would say that is because the version (year) of the msvc package is part of the package name, whereas gcc and clang do not include the version in the name.

```yaml
c_compiler:
  - gcc                        # [linux]
  - clang                      # [osx]
  - vs2019                     # [win and x86_64]
  - vs2022                     # [win and arm64]
c_compiler_version:            # [unix]
  - 12                         # [linux]
  - 16                         # [osx]
  - 10                         # [os.environ.get("CF_CUDA_ENABLED", "False") == "True" and linux]
  - 11                         # [os.environ.get("CF_CUDA_ENABLED", "False") == "True" and linux]
```

isuruf commented 2 months ago

That was my bad. We need a repodata patch that does https://github.com/conda-forge/vc-feedstock/pull/75/files

xhochy commented 2 months ago

It seems that builds that previously installed both vsYYYY versions now fail with a solving error. In my case, this is due to rust forcing vs2019: https://github.com/conda-forge/rust-activation-feedstock/blob/492cf0058a561fbe9e2666099a4adb77bbe2d0ab/recipe/meta.yaml#L43

We shouldn't have both at the same time, i.e. we need to change rust. But how should rust-activation be set up now?

h-vetinari commented 2 months ago

But how should the rust-activation be setup now?

It should be fine to use vs_{{ cross_target_platform }}, which will use either vs2019 or vs2022.
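For reference, a hypothetical excerpt of how that could look in rust-activation's meta.yaml (the exact location of the requirement in that recipe is an assumption):

```yaml
requirements:
  run:
    # generic VS activation; resolves to vs2019 or vs2022 as appropriate
    - vs_{{ cross_target_platform }}  # [win]
```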

BastianZim commented 1 month ago

Hi everyone, just one small thing about the linter. In https://github.com/conda-forge/poetry-feedstock/pull/99#issuecomment-2101613350 we received a lint hint that we should add the stdlib function because we are using __osx, but the recipe is noarch (just platform-specific noarch). I guess the linter is wrong, but I just wanted to confirm that is the case.

jakirkham commented 1 month ago

Thanks for the pointer Bastian! 🙏

Agree this is a linter bug. Replied in that thread: https://github.com/conda-forge/poetry-feedstock/pull/99#pullrequestreview-2048476804

h-vetinari commented 1 month ago

I guess the linter is wrong but I just wanted to confirm that is the case.

No, it's a linter bug, sorry about that. 😅 I had forgotten about the noarch-but-on-different-platforms case. I'll fix it later today.

jakirkham commented 1 month ago

Looks like Bastian filed the linter noarch bug as issue: https://github.com/conda-forge/conda-smithy/issues/1924

LourensVeen commented 1 month ago

I'm running into problems with the openmpi package, which requires __glibc>=2.17. It works fine when installed in isolation, but if I also install gcc, then sysroot_linux-64==2.12 gets installed, and openmpi breaks because the glibc 2.12 from sysroot shadows my newer system glibc. See https://github.com/conda-forge/openmpi-feedstock/issues/143#issuecomment-2135519831 and the rest of that issue.

If the sysroot packages are going to continue to exist, should there be an automatic run_constrained requirement on sysroot matching the glibc version set by c_stdlib_version if stdlib('c') is present? Or is there something wrong with the way openmpi is built?

h-vetinari commented 1 month ago

Thanks for the report! The easiest fix when something is wrongly picking up glibc 2.12 (either in a feedstock involving CUDA, or because some symbols are actually needed at runtime) is to add:

```yaml
os_version:
  linux_64: cos7
```

to conda-forge.yml and then rerender.

LourensVeen commented 1 month ago

I'm not sure anything is actually wrong here; there seems to just be an unfortunate interaction, or maybe an incomplete upgrade from 2.12 to 2.17.

The gcc package depends on gcc_impl_linux-64, which depends on sysroot_linux-64 without a version constraint, and the package with the highest build number there is sysroot_linux-64 2.12 he073ed8_17. So if you conda install gcc you get glibc 2.12 along with it.

I'm not sure this is intentional (the sysroot build numbers seem to all have started at 1, as opposed to there being a 100+n hack expressing a preference), but gcc is used to build packages on Linux at least, and it would make sense for it to install 2.12 to ensure that any built packages require only 2.12 by default, regardless of what's available in the OS used for the build.

Unfortunately, this means that installing gcc breaks any packages that require a higher glibc version than 2.12, because they'll end up dynamically linking against glibc 2.12 from sysroot, as opposed to linking against the usually much newer system glibc.

I'm not sure I understand what you're writing above; are you suggesting that this change needs to be made to ctng-compilers-feedstock, in order to make it depend on glibc 2.17, and thus sysroot 2.17, which would then fix the problem? If conda-forge is dropping support for glibc 2.12, then that could be done, but if there are packages that require glibc >2.17, they'd still break as soon as you install gcc.

beckermr commented 1 month ago

It isn't highest build number since 2.12 < 2.17 and the solver would want a higher version. We have extra features and another package that weigh down the newer sysroots. Once the stdlib migration is done, I think we'll be able to remove those extra hacks. This may help.

LourensVeen commented 1 month ago

Ah, that makes sense, there are still many things in Conda I don't understand, and features is one of them. I do think that having everything at 2.17 will solve my immediate problem with OpenMPI, and I can work around it with a build dependency on the newer sysroot I think.

I'm not sure it will solve the problem in general though. As pointed out above there are already packages that require a newer glibc than the current standard for one reason or another. One possibility is of course to limit conda-forge to the newest version that will compile with glibc 2.17 or whatever the standard is of the time, but that means limiting all users to what the most obsolete users can support. I guess that's the current situation? Seems to mostly work, especially if the standard glibc can be updated a bit more frequently once the automation is in place.

Having thought about it some more, however, I think the general problem is this: suppose package A requires a newer glibc (say __glibc>=2.25), and package B links against A.

When a user tries to install B on a machine with __glibc>=2.25, they get B and A and all is well. If the machine has __glibc<2.25, they'll get an error because A has an unresolvable dependency and B depends on A. So far so good.

But when we're building B, we'll have A as well as a compiler and a sysroot in the build environment (assuming that the sysroot package will continue to exist?), and assuming the sysroot is the conda-forge standard one, B will fail to link, as A contains unresolved symbols that are not available in the sysroot glibc that the linker is linking against.

One option would be for B to specify c_stdlib_version 2.25, but that's weird because B doesn't itself require that, and you would have to manually keep track of all your dependencies and what they require. Instead, if setting c_stdlib_version to 2.25 automatically added a run_constrained on sysroot >=2.25, then A would have such a constraint in its metadata.

When installing B, this wouldn't change anything, as there's no sysroot installed. But when building B, the compiler would pull in sysroot (without a version constraint), and the run_constrained on A would then ensure that it's new enough to support A. As a result, B may end up with unresolved symbols from the newer glibc (and a corresponding dependency on __glibc>=2.25), but since A needs them anyway and B needs A, this doesn't change where the software can run.

The only snag I can see is that we'll want the resolver to choose the lowest compatible sysroot version when setting up the build environment, in order to ensure maximum compatibility. That's not how the resolver works though, so this would require some sort of hack...
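To make that concrete, a hypothetical rendering of A's metadata under this (not implemented) scheme:

```yaml
requirements:
  run:
    - __glibc >=2.25            # [linux]
  run_constrained:
    # proposed: any co-installed sysroot must be at least as new as
    # the glibc baseline A was built against
    - sysroot_linux-64 >=2.25
```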

h-vetinari commented 1 month ago

The only snag I can see is that we'll want the resolver to choose the lowest compatible sysroot version when setting up the build environment, in order to ensure maximum compatibility.

That's the whole idea behind {{ stdlib("c") }}. We're now linting for its presence because it's exactly the infrastructure that should take care of pinning the sysroot version at build time to a controllable quantity (with a global baseline that can be overridden per feedstock).
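For example, a feedstock that needs a newer baseline than the global default can opt in locally; a sketch, assuming the override goes into the feedstock's recipe/conda_build_config.yaml before rerendering:

```yaml
c_stdlib_version:    # [linux]
  - 2.17             # [linux]
```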

If there are other specific problems you experience with openmpi, please open an issue on that feedstock and let's discuss it there.

minrk commented 4 weeks ago

Re-raising the openmpi issue and linking it here: it would be good to get some guidance on how/where/when to propagate the fact that building a package with sysroot 2.17 may mean that any downstream package linking it must also be built with at least that sysroot version. That means e.g. openmpi should have

```yaml
run_constrained:
  - sysroot_{{ target_platform }} >=2.17
```

but this isn't in run_exports for sysroot, so openmpi can currently be installed in build envs that pick up the default sysroot 2.12, which results in failed linking looking for memcpy@GLIBC_2.14 etc. My questions are really:

  1. is this an openmpi-specific problem, or general issue with newer c_stdlib than default?
  2. should we put this run_constrained on sysroot on openmpi?
  3. should sysroot have this run_constrained in its run_exports?

Feel free to continue the discussion in the openmpi issue, but I think some info in the general conda-forge docs on using new c_stdlib may be appropriate.

isuruf commented 4 weeks ago

which results in failed linking looking for memcpy@GLIBC_2.14 etc..

Are you linking statically? Or is LDFLAGS (which includes -Wl,--allow-shlib-undefined) not being passed in?

beckermr commented 4 weeks ago

There was a recent openmpi pr that overrode the standard compiler flags. Did that get merged?

h-vetinari commented 4 weeks ago

There's some discussion on this in https://github.com/conda-forge/linux-sysroot-feedstock/issues/63.

minrk commented 4 weeks ago

Are you linking in statically? Or is LDFLAGS which includes -Wl,--allow-shlib-undefined not passed in?

Not linking statically, but I think the affected cases do not pass $LDFLAGS. At least some are fixed if $LDFLAGS is passed (i.e. openmpi's own tests, which run with sysroot 2.12 even when built with 2.17, due to the lack of pinning). But they do include things like CMake's FindMPI, which runs various checks to see if the MPI compiler works (which it won't), and CMake doesn't consume LDFLAGS during this stage, I think. (Update: CMake was wrongly accused; all affected envs so far either don't have LDFLAGS set (e.g. gfortran but not gfortran_linux-64), or override link commands and drop LDFLAGS (e.g. DAMASK, esmf).)

Another slightly tangential issue is the possibility of setting all of FindMPI's (many) variables to prevent its discovery stage from making guesses.

Right now, I think most packages that link openmpi have had to bump c_stdlib_version to 2.17 in order to get builds to pass with openmpi 5. I don't know if all of the affected packages have at least one call missing $LDFLAGS.

There was a recent openmpi pr that overrode the standard compiler flags. Did that get merged?

Yes, I made a PR that merged $LDFLAGS into $OMPI_LDFLAGS, so that using the MPI compiler wrappers would apply the standard flags. This apparently broke some builds outside conda-forge, though I still don't know how, since there are no error messages in the failed builds. I have a PR to roll that back now.