conda-forge / conda-forge.github.io

The conda-forge website.
https://conda-forge.org
BSD 3-Clause "New" or "Revised" License

roll-out of alma8-based infrastructure (aka centos 8) #1941

Open beckermr opened 1 year ago

beckermr commented 1 year ago

This issue tracks centos 8 implementation PRs

The alma 8 repo is https://repo.almalinux.org/almalinux/8.7/BaseOS/x86_64/os/Packages/ etc.

jaimergp commented 1 year ago

xrefs:

jaimergp commented 1 year ago

What I understood was:

And maybe I am missing something else, but we can probably merge the discussion with the CentOS 8 thread mentioned in my comment above.

beckermr commented 1 year ago

Someone, I think @isuruf, mentioned listing versions for other things. libm maybe?

h-vetinari commented 1 year ago

The cosN (CentOS version N) prefixes we had in some places will be replaced by conda_X_YZ (with X.YZ being the glibc version)

Manylinux uses glibc major.minor for its versioning, and I think it's good to match that (in the core call the mood seemed to be that conda_2_28 would be the best option).

That said, not least after the ill-fated Debian-based manylinux_2_24, in effect it's been just a way to encode the RHEL major version:

| manylinux      | glibc | RHEL |
|----------------|-------|------|
| manylinux1     | 2.5   | 5    |
| manylinux2010  | 2.12  | 6    |
| manylinux2014  | 2.17  | 7    |
| manylinux_2_28 | 2.28  | 8    |

That's because manylinux cannot bring its own updated compiler stack along and so is dependent on having the devtoolset backports. Perhaps keeping RHEL X as a base (through one of its ABI-compatible derivatives like Alma, Rocky, UBI) solves the other versioning questions, even if we call it conda_2_28?
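
The manylinux/glibc/RHEL correspondence in the table above can be captured in a small lookup. This is a hypothetical helper (not part of any conda-forge or manylinux tooling): the three legacy tags have special names, and anything newer follows the PEP 600 `manylinux_X_Y` scheme.

```python
# Legacy manylinux tags predate PEP 600 and have fixed names; each one
# corresponds to a RHEL/CentOS major version via its glibc.
LEGACY_TAGS = {
    (2, 5): "manylinux1",      # RHEL/CentOS 5
    (2, 12): "manylinux2010",  # RHEL/CentOS 6
    (2, 17): "manylinux2014",  # RHEL/CentOS 7
}

def manylinux_tag(glibc_major: int, glibc_minor: int) -> str:
    """Return the manylinux tag for a glibc version, e.g. (2, 28) -> manylinux_2_28."""
    return LEGACY_TAGS.get(
        (glibc_major, glibc_minor), f"manylinux_{glibc_major}_{glibc_minor}"
    )
```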

beckermr commented 1 year ago

Yes. We plan to keep alma8 as the base.

isuruf commented 1 year ago

One plus of just using conda instead of conda_2_28 is that the user does not have to deal with cdt_name in the recipes. They can just use the versioning of the sysroot, and cdt_name will become obsolete.

beckermr commented 1 year ago

We'll need to tuck the cdt name in the CDTs somewhere maybe? Idk if the CDT package names will conflict or not.

beckermr commented 1 year ago

Can you send an example recipe where people are dealing directly with cdt_name? I thought the jinja2 function took care of that.

isuruf commented 1 year ago

For eg: https://github.com/conda-forge/qt-main-feedstock/blob/main/conda-forge.yml#L18-L19

isuruf commented 1 year ago

On the other hand, that line in conda-forge.yml serves two purposes. Setting cdt_name and the docker image name.
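
For reference, the linked lines are a per-platform override in the feedstock's conda-forge.yml. A sketch of that dual role (values hypothetical, modeled on the qt-main example):

```yaml
# conda-forge.yml (sketch): this one key selects both the docker image
# variant and, via smithy, which CDTs are used -- the dual purpose noted above.
os_version:
  linux_64: cos7
  linux_aarch64: cos7
```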

beckermr commented 1 year ago

Hmmmm. Being able to at least match cdts to the os for our docker containers is inherently useful possibly? This would argue for using conda_2_28 in both places.

isuruf commented 1 year ago

Being able to at least match cdts to the os for our docker containers is inherently useful possibly?

That's not really needed. We use cos6 CDTs in cos7 docker images.

beckermr commented 1 year ago

I'm not saying we always have to have them matched. I'm saying that using the same notation in both places is helpful.

jakirkham commented 1 year ago

Would it help to start a PR for Docker images? Or are we not ready for that yet?

beckermr commented 1 year ago

Go for it but I don't expect it to be merged anytime soon.

jakirkham commented 1 year ago

Gotcha, what do we see as the required steps before they are merged? Asking since they wouldn't be integrated anywhere by just publishing the image. Or is there something else I'm missing?

beckermr commented 1 year ago

I am not sure what goes in them. If we need the sysroots to put in them, then we'd need that. If they don't have anything special, then we can just build it.

jakirkham commented 1 year ago

Gotcha

Don't think the sysroot is needed

The images cache the compiler packages as a convenience, but that can be disabled temporarily or it can use older compilers for now. Not too worried about this

Can't think of anything else of concern

If something comes up when we start working on them, we can always discuss

jakirkham commented 1 year ago

Started adding an image in PR ( https://github.com/conda-forge/docker-images/pull/235 )

jakirkham commented 1 year ago

From Matt, we need to update os_version for conda_2_28. Also there is a corresponding way to do this in staged-recipes

xref: https://github.com/conda-forge/conda-smithy/pull/1434

h-vetinari commented 1 year ago

adjust smithy to allow for easy alma8 config

What's the intention here? Being able to switch the image? (I'm asking because I'd be willing to give it a shot...)

FWIW, the current 2.17 setup doesn't need any smithy interaction; it's enough to just add `sysroot_linux-64 2.17  # [linux64]` to the build dependencies. AFAIR that's because we switched the images to cos7 while still using the cos6 sysroot by default. This setup was the result of a bunch of discussions around resolver errors and other issues with older images when there are packages built against the newer sysroot (for details see this summary of the very clever setup proposed by Isuru).
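
For concreteness, that opt-in lives in the recipe's meta.yaml build requirements (sketch; the trailing comment is a conda-build selector):

```yaml
requirements:
  build:
    - "{{ compiler('c') }}"
    # opt in to the newer (cos7) sysroot on linux-64 only
    - sysroot_linux-64 2.17  # [linux64]
```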

Is there a reason we couldn't switch the images to alma8, but keep the sysroot at cos6 (with opt-in upgrade to cos7 & alma8)?

beckermr commented 1 year ago

There is no reason we couldn't bump images. This may break builds using yum requirements if the packages have changed names or conventions upstream.

h-vetinari commented 1 year ago

This may break builds using yum requirements if the packages have changed names or conventions upstream.

According to this search, there are around 170 recipes that use yum_requirements.txt (for the most part, it's mesa, x11, etc.). I guess it would be possible to audit those for any name changes; however, given that Alma 8 intends to be bug-for-bug compatible with RHEL 8, I strongly doubt that packages would change names TBH.

beckermr commented 1 year ago

Fair point. I'm happy with simply bumping the default image.

h-vetinari commented 1 year ago

There seems to be something going awry with the new repodata hack. I'm getting Alma 8 kernel headers together with the COS 7 sysroot on aarch:

The following NEW packages will be INSTALLED [selection]:

    _sysroot_linux-aarch64_curr_repodata_hack: 4-h57d6b7b_13             conda-forge
    kernel-headers_linux-aarch64:              4.18.0-h5b4a56d_13        conda-forge  # <- kernel version in RHEL 8 !!
    sysroot_linux-aarch64:                     2.17-h5b4a56d_13          conda-forge  # <- glibc version in RHEL 7

Interestingly, this is not happening on PPC, where I get:

The following NEW packages will be INSTALLED [selection]:

    _sysroot_linux-ppc64le_curr_repodata_hack: 4-h43410cf_13             conda-forge
    kernel-headers_linux-ppc64le:              3.10.0-h23d7e6c_13        conda-forge  # <- kernel version in RHEL 7
    sysroot_linux-ppc64le:                     2.17-h23d7e6c_13          conda-forge  # <- glibc version in RHEL 7

beckermr commented 1 year ago

What makes you think it is the repodata hack?

h-vetinari commented 1 year ago

What makes you think it is the repodata hack?

It must be related to the sysroot, where the kernel-headers are built, and I couldn't see a difference between aarch/ppc in https://github.com/conda-forge/linux-sysroot-feedstock/pull/46. Both variants are pulling in crdh 4, but looking a bit closer, that divergence between aarch & ppc goes back much further. Seems we've been using newer kernel headers on aarch since https://github.com/conda-forge/linux-sysroot-feedstock/pull/15 (corresponding to linux version in RHEL 8, but apparently still being downloaded through CentOS 7 repos[^1]). Seems it's not critical.

[^1]: though that PR doesn't document the rationale, so I can't really say why it diverged at all

jakirkham commented 1 year ago

This is worth a read

https://almalinux.org/blog/impact-of-rhel-changes/

beckermr commented 1 year ago

Ugh.

h-vetinari commented 1 year ago

I was waiting for that kind of statement after the news hit yesterday. TBH, I can kinda understand the decision from a business POV.

In any case, we could very easily move to rhubi 8 (Red Hat Universal Base Image); it was already a strong candidate in the discussion[^1] for manylinux_2_28, where it only fell short due to lack of the gcc-devtoolset backports. However, we don't need those because we have our own compilers (aside from the fact that they've been added in the meantime).

[^1]: check out the table in the OP

jakirkham commented 1 year ago

Thought we couldn't install packages with yum on the UBI images? This is one of the things needed by feedstocks (particularly with GUI tests using X11)

In any event maybe let's see how AlmaLinux handles this issue. We would have needed to add new CDTs for it anyways. So now what they will be based on could be a bit different (though that was the case anyways when moving distro & version)

beckermr commented 1 year ago

Per the discussion today, we decided to take the "wait and see" approach advocated by @jakirkham above on what to do for glibc >= 2.24 / centos 8 / alma8 support.

h-vetinari commented 1 year ago

Thought we couldn't install packages with yum on the UBI images? This is one of the things needed by feedstocks (particularly with GUI tests using X11)

Taking the rpms appearing on the first page of searching in recipe/yum_requirements.txt in conda-forge, and trying to install them in redhat/ubi8, I get:

$ docker run --rm -it redhat/ubi8
[root@6e8d1fb61da2 /]# yum install dbus-libs httpd-devel libSM-devel libX11 libX11-devel libXau libXcomposite libXcursor libXcursor-devel libXdamage libXext libXext-devel libXfixes libXi libXrender libXrender-devel libXxf86vm libglu1-mesa libselinux libx11 libxcb libxkbcommon-x11 mesa-dri-drivers mesa-libGL mesa-libGL-devel mesa-libGLU-devel numactl-devel xorg-x11-server-Xvfb
[...]
Error: Unable to find a match: libSM-devel libXcursor-devel libXext-devel libXrender-devel libglu1-mesa libx11 libxkbcommon-x11 mesa-dri-drivers mesa-libGL-devel mesa-libGLU-devel numactl-devel xorg-x11-server-Xvfb

Translating this into a more readable list of which ones are (un)available in UBI, at least by default:

dbus-libs
httpd-devel
libSM-devel             # unavailable
libX11
libX11-devel
libXau
libXcomposite
libXcursor
libXcursor-devel        # unavailable
libXdamage
libXext
libXext-devel           # unavailable
libXfixes
libXi
libXrender
libXrender-devel        # unavailable
libXxf86vm
libglu1-mesa            # unavailable
libselinux
libx11                  # unavailable; note libX11 is available
libxcb
libxkbcommon-x11        # unavailable
mesa-dri-drivers        # unavailable
mesa-libGL
mesa-libGL-devel        # unavailable
mesa-libGLU-devel       # unavailable
numactl-devel           # unavailable
xorg-x11-server-Xvfb    # unavailable

What stands out is that libX<foo>-devel seems to be mostly unavailable, but libX<foo> is. Not sure if those are packaged differently than on RHEL 7, or genuinely missing. Also, while 3/4 mesa-related packages fail, mesa-libGL is available.

I cannot tell if this would suffice for our purposes, but at least it should be clear that we can yum-install stuff.
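
The annotated list above can be derived mechanically by subtracting the names in yum's "Unable to find a match" line from the requested set. A small sketch (package lists abbreviated from the transcript above):

```python
def annotate(requested, missing):
    """Mark each requested rpm name as unavailable if yum couldn't match it."""
    return [
        f"{pkg:24}# unavailable" if pkg in missing else pkg
        for pkg in requested
    ]

# Abbreviated lists taken from the yum transcript above.
requested = ["dbus-libs", "libSM-devel", "libX11", "mesa-libGL", "mesa-dri-drivers"]
missing = {"libSM-devel", "mesa-dri-drivers"}
for line in annotate(requested, missing):
    print(line)
```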

jaimergp commented 1 year ago

More news about RHEL: https://techcrunch.com/2023/07/11/why-suse-is-forking-red-hat-enterprise-linux/

h-vetinari commented 1 year ago

SUSE's effort will be a fork and not a clone (presumably based on CentOS Stream? SUSE already has an enterprise offering, so it's very curious that they'd provide a knock-off of the competition), which loses the bug-for-bug compatibility guarantees with RHEL. But the whole situation has clearly caused quite a stir, so this'll probably take a while to settle.

I'd prefer to not wait that long with our sysroot. IMO we should move forward with one of the RHEL clones in its current state, as switching between those clones down the line will stay compatible. rhubi would be the safest (if we can make it work), but we could also pretty easily use CentOS Stream (still supported for a while) or even Alma in its current state.

I don't think Debian 10 (also using glibc 2.28) is a realistic alternative: its EOL is 2024-06-30, only a month after the EOL of CentOS Stream 8, so we could just use that without having to switch between distributions (yum vs. apt, different package names, etc.).

jakirkham commented 1 year ago

Maybe we could try out the UBIs in a feedstock? Idk how compatible they will be with the different RHEL clones, but at least they will be compatible with RHEL

Wonder how the clones are going to handle this? Maybe they will create a set of shared sources they work with? If so, sticking with AlmaLinux could be fine

Since we are discussing yum package availability and weighing our options, it is worth trying maybe one of the harder yum cases we can think of. Naively would start with something like qt-main or maybe qtconsole

Though we could also systematically generate all packages we get from yum by looping over feedstocks and collecting yum_requirements.txt into a set of packages
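
The systematic collection described above could be sketched like this, assuming local checkouts of the feedstocks under a single directory (path layout hypothetical):

```python
from pathlib import Path

def collect_yum_requirements(feedstocks_dir):
    """Union of package names from every recipe/yum_requirements.txt.

    Blank lines and '#' comments are skipped.
    """
    packages = set()
    for req_file in Path(feedstocks_dir).glob("*/recipe/yum_requirements.txt"):
        for line in req_file.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                packages.add(line)
    return packages
```

The resulting set could then be checked against the target image in one go, as in the yum experiment above.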

h-vetinari commented 1 year ago

Wonder how the clones are going to handle this?

Alma has decided to no longer aim to be 1:1 compatible with RHEL, but will still aim to be ABI-compatible (which should normally be good enough for us). But if they're planning to use the CentOS Stream sources for that, then those go EOL much earlier than they promise to. Not sure yet how that'll shake out.

Maybe we could try out the UBIs in a feedstock? Idk how compatible they will be with the different RHEL clones, but at least they will be compatible with RHEL

I continue to think that UBI is probably the best option. All of those under discussion will be ABI-compatible anyway, but UBI has the fewest question marks around availability/stability, and the longest EOL.

jakirkham commented 1 year ago

Also with yum packages it might be worth revisiting if some of these can move over to Conda packages. We have a pretty wide set of X11 packages, which should eliminate the need of getting these from yum

The missing mesa packages are a bit worrying since those are needed by packages that have Graphics components. AFAIK they are not easily replaceable so will need some thought

The other wrinkle to keep in mind is CDTs are built from yum packages (and in some cases yum sources). So what RHEL provides (or what clones use) directly affects the CDTs

beckermr commented 1 year ago

The other wrinkle to keep in mind is CDTs are built from yum packages (and in some cases yum sources). So what RHEL provides (or what clones use) directly affects the CDTs

I :100: agree here. This is an important point. CDTs smell funny, but I don't want us crippled by what RH provides in the UBI.

h-vetinari commented 1 year ago

We have a pretty wide set of X11 packages, which should eliminate the need of getting these from yum

Things are moving to Wayland anyway (slowly but surely), so perhaps we can indeed get rid of some things there.

Maybe even packaging mesa itself could work? Their installation instructions do cover running against a non-system build, so it sounds like it may be possible to pull off?

In any case, if UBI is not fully featured enough in terms of yum packages, I feel we might just go with CentOS stream for now. If Alma has found a way to keep going by the time stream goes EOL, we can still switch because things are (and will remain) ABI compatible

h-vetinari commented 1 year ago

Looking at the most recent Alma update (about a month ago) again, it does seem like they'll be focusing on CentOS Stream. They're not putting it that directly, but

We will also start asking anyone who reports bugs in AlmaLinux OS to attempt to test and replicate the problem in CentOS Stream as well, so we can focus our energy on correcting it in the right place.

certainly points in that direction. Following CentOS Stream will keep them ABI-compatible (which is enough for us), but not bug-for-bug compatible as before. It's also not clear to me how they'll keep going after a given version of CentOS Stream goes EOL (much earlier than the respective RHEL version), but the parts we need are hardly going to change much anyway.

I couldn't make the core call today, but I see in the notes that:

SUSE as an option potentially? Will wait and see; still unclear where everything stands

I would stay with the RHEL clones. Even CentOS Stream would be good enough, if we just keep using it past its EOL (like we've been doing for CentOS 6 as well). It's not like we get a noticeably longer EOL with OpenSUSE (the latest 15.5 is EOL in Dec. '24, barely half a year more than CentOS Stream 8).

Finally, we could still decide to do OpenSUSE independently of whether we finish off the 2.28 Alma sysroots now, because OpenSUSE is on glibc 2.31, so they would be entirely separate.

beckermr commented 1 year ago

@h-vetinari I meant using this: https://www.suse.com/news/SUSE-Preserves-Choice-in-Enterprise-Linux/

h-vetinari commented 1 year ago

Yeah okay, sorry for misinterpreting that one. For that initiative, aside from OpenSUSE cannibalising its own Enterprise Linux product (so who knows how long that remains attractive to them), they also only have the sources of CentOS Stream to leverage for ABI-compat. Once the updates there stop, I don't see how this is any different from any of the other RHEL clones.

More importantly, the clones are all ABI-compatible (which is the number one argument to have a clone in the first place), and likely very tightly derived from RHEL sources and/or CentOS stream sources. So it's very likely that we could change the RHEL clone behind our sysroot (and even CDTs), and it would not be an issue.

So I think we can't really do much wrong by going with Alma for now. Either they find a way to keep going past Stream's EOL, or there's another ABI-compatible distro to switch to that does, or we just accept that the image doesn't receive updates anymore. None of these sound terrible IMO.

beckermr commented 1 year ago

Right yeah. My only thought might be that folks could choose to be bug-for-bug compatible with their distro in which case we might have a preference. Only time will tell though. At the ABI level, totally agreed.

h-vetinari commented 1 year ago

I think no clone will realistically be able to stay bug-for-bug compatible. Alma already explicitly abandoned that.

It's not realistic to do without knowing the RHEL sources, which no-one gets anymore (well, and if you sneak them out of a subscription and into your distro, Red Hat can trivially observe in the open distro sources whether someone is picking up their internal patches, and then come and hound you).

The only way to do it would be to reverse engineer the bug fix from the bug report and profiling RHEL behaviour, but ain't nobody got time for dat.

It's just too much risk and effort for too little gain I think. Especially as ABI-compat is IMO orders of magnitude more important than whether a given bug behaves exactly the same as on RHEL.

Beyond that, I don't see how bug-for-bug compat would be useful to conda-forge. We need the ABI, but I think we can agree that we'd generally like to have fewer bugs, not more (and most of our Linux users will be on way newer distros that carry those fixes already). So from that POV, following CentOS Stream would be attractive, were it not for the shorter EOL.

Luckily, keeping the ABI while backporting their own bugfixes is something that is at least realistically feasible for the clones, so they actually have a chance to extend that EOL as promised, beyond what they get from stream. Overall, I still think the best choice is going forward with Alma now.

jakirkham commented 1 year ago

IIRC in our last conda-forge meeting it sounded like we are ok going ahead with AlmaLinux 8. Did I understand correctly?

If so, it sounds like it comes down to doing these next steps (particularly upgrading CDTs). Does that sound right?

beckermr commented 1 year ago

Yes but see the list above. We are trying to get rid of as many CDTs as we can.

h-vetinari commented 1 year ago

We have merged https://github.com/conda-forge/docker-images/pull/242 recently. What are some next steps here? Are we ready to tackle the first CDTs (modulo https://github.com/conda-forge/cdt-builds/issues/66)?

beckermr commented 1 year ago

We agreed to get rid of as many CDTs as we could. So that's the next step. To go through them and figure out what we can build ourselves.

jakirkham commented 1 year ago

Maybe an earlier step is just getting a list of CDTs we use so we can search conda-forge for fuzzy matches