Homebrew / brew

🍺 The missing package manager for macOS (or Linux)
https://brew.sh
BSD 2-Clause "Simplified" License
40.74k stars 9.55k forks

Migrate from Ubuntu 16.04/gcc-5 to gcc-6/7 for Linux builds #12565

Closed iMichka closed 2 years ago

iMichka commented 2 years ago

Provide a detailed description of the proposed feature

Our current Linux builds are run with Ubuntu 16.04, gcc-5 and glibc 2.23. If you use bottles, anything newer than this will be fine.

We also support "from source" builds/installations for glibc 2.13 and newer. This means that if you use 2.13 <= glibc < 2.23, you can still install Homebrew, and build stuff from source if necessary.

This works because binutils is currently built in a special Debian Wheezy container, with glibc 2.13.
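The two tiers described above amount to a version comparison, sketched here in Python. This is only an illustration, not Homebrew's actual logic (Homebrew itself is written in Ruby). The names `parse_glibc_version` and `support_tier` are hypothetical; the thresholds 2.13 and 2.23 are the ones quoted in this comment.

```python
# Hypothetical sketch of the support tiers described above; not Homebrew code.
MIN_SOURCE_GLIBC = (2, 13)  # oldest glibc supported for from-source builds
BOTTLE_GLIBC = (2, 23)      # glibc of the Ubuntu 16.04 bottle build host


def parse_glibc_version(text):
    """Turn a string such as "2.17" into a comparable tuple such as (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def support_tier(system_glibc):
    """Classify a host by its system glibc version string."""
    version = parse_glibc_version(system_glibc)
    if version >= BOTTLE_GLIBC:
        return "bottles"
    if version >= MIN_SOURCE_GLIBC:
        return "from-source"
    return "unsupported"


if __name__ == "__main__":
    for candidate in ("2.12", "2.13", "2.17", "2.23", "2.31"):
        print(candidate, support_tier(candidate))
```

Comparing integer tuples rather than strings or floats matters here: as a float, 2.9 would sort above 2.13, while `(2, 9) < (2, 13)` gives the intended ordering.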

Ubuntu 16.04 reached the end of its standard support period in April 2021.

I propose that we move forward and aim for a build system with a newer gcc.

I would like to gather some opinions on what we should aim for first. Which glibc version? Should we go for gcc-6 or straight to gcc-7 with Ubuntu 18.04?

What is the motivation for the feature?

Do not let technical debt pile up. Also, more and more formulae require gcc-7 at least, so we need to ship brewed gcc, which in principle would not be necessary.

How will the feature be relevant to at least 90% of Homebrew users?

What alternatives to the feature have been considered?

Bo98 commented 2 years ago

Given backwards ABI compatibility, is there actually anything that needs to be done except updating CompilerSelector.preferred_gcc?

Similar for glibc, but that would be a formula update.

We also support "from source" builds/installations for glibc 2.13 and newer. This means that if you use 2.13 <= glibc < 2.23, you can still install Homebrew, and build stuff from source if necessary.

I thought the way it worked was you installed brewed glibc if you have < 2.23 (or rather, it did so automatically) and then everything would be as normal and you can use bottles.

iMichka commented 2 years ago

Given backwards ABI compatibility, is there actually anything that needs to be done except updating CompilerSelector.preferred_gcc?

I think ABI compatibility is fine. We had issues between gcc-4 and 5, but from 5 to 6 or 7 (or even newer) should be good. Of course before doing anything we need good planning, implementation, and good testing. I want to make sure we don't break anything.

Also, regarding CompilerSelector.preferred_gcc, we might run a test PR with the full setup, and might need to be able to tweak it through an env variable before merging anything.

Maybe we need to (I have not thought it through completely):

I thought the way it worked was you installed brewed glibc if you have < 2.23 and then everything would be as normal and you can use bottles.

Yeah, that's the theory. I don't know how well this all works. I have seen issues pop up from time to time about users compiling stuff from source / having issues with git / curl / glibc versions. I think a lot has been done regarding this over the last years and the experience is maybe seamless now, but it is always a bit of an unknown for me. Ideally we should have a CI that at least tests that "old OS / old things" scenario from time to time.

Bo98 commented 2 years ago
  • add a security check to stop updates for older glibc versions? Else we might destroy user's installations.
  • drop the wheezy container. Do we need to rebuild the binutils bottle?

I feel these are a separate debate from the rest. Updating to Ubuntu 18.04 (glibc 2.?? and gcc-7) seems to be a different task altogether from updating the absolute minimum of glibc 2.13. That minimum purely affects binutils and portable-ruby and nothing else, since the rest should depend on brewed glibc.

Maybe we need to (I have not thought it through completely):

  • update the glibc formula? (there is an audit in brew to tweak for that too)

  • update/change the linux-headers@4.4 formula?

  • downgrade formulae that use brewed gcc to system gcc if possible

Yes, this seems correct.

I have seen issues pop up from time to time about users compiling stuff from source / having issues with git / curl / glibc versions. I think a lot has been done regarding this over the last years and the experience is maybe seamless now, but it is always a little bit the unknown for me.

Looks like the automatic installation of glibc might be broken. The old Linux-specific gcc.rb specified if OS::Linux::Glibc.system_version < Formula["glibc"].version but the new gcc@5.rb has if Formula["glibc"].any_version_installed?.

Apart from that it should work. As far as I know, the rpath of the bottles we build are designed to switch to use brewed glibc if installed (perhaps should double check that still works under newer GCC but I don't see any reason why not).

iMichka commented 2 years ago

Ubuntu 18.04 has glibc 2.27

What you say is (roughly):

I can be happy with that, if we add CI to make sure old versions keep working. Maybe by installing brew plus a few formulae and running their tests in a Wheezy or Ubuntu 16.04 container.

Bo98 commented 2 years ago

Yep, sounds good to me.

danielnachun commented 2 years ago

I'm 99% certain glibc, curl, git, and all of their dependencies will be relocatable with https://github.com/Homebrew/brew/pull/12534. Anaconda already ships all those packages as relocatable and I've already tested quite a few of them using the PR in brew with no issues so far. So we won't have to build glibc or binutils on Wheezy anymore once that is the case. I'm not sure how long it will take to get that PR done plus the necessary changes to brew install, brew bottle and brew test-bot, but it will simplify the bootstrapping process considerably when nothing is built from source.

I'd also like to experiment with building a portable-ruby that is statically linked to musl, which could also be built on whatever CI image you want. musl recommends Linux kernel 2.6 or above, but will still work even on older kernels with reduced functionality. This portable-ruby would only be used to bootstrap the install to the point where glibc could be installed, after which we could use a portable-ruby that uses brewed glibc. glibc can run on an older version of the kernel than it was built with (with the same recommended minimum of 2.6), so we would no longer need a minimum glibc at all, and could just spit out a warning if the kernel is older than 2.6 (which is very unlikely).

This latter goal is a bit further out, but overall these changes should smooth out a lot of issues for supporting older Linux distros in the near(ish) future.

Bo98 commented 2 years ago

That sounds like a good idea (but also shows how we should treat the minimum 2.13 as a separate issue).

For portable-ruby on musl, you'll probably need a patch: https://bugs.ruby-lang.org/issues/14387#note-13.

This portable-ruby would only be used to bootstrap the install to the point where glibc could be installed, after which we could use a portable-ruby that uses brewed glibc.

Is there a reason why we would need to switch to one that uses glibc like this?

danielnachun commented 2 years ago

That sounds like a good idea (but also shows how we should treat the minimum 2.13 as a separate issue).

Yes, and I agree that minimum glibc and kernel support is a separate issue.

Is there a reason why we would need to switch to one that uses glibc like this?

Probably not. That would make things even simpler if we just stick to it.

danielnachun commented 2 years ago

Jump right to Ubuntu 20.04? But some packages might not build with newest gcc, so it's probably good to lag a little bit behind and let other package managers do the heavy lifting.

Moving to Ubuntu 20.04, which uses gcc-9 may not be too crazy actually. All of the errors I've seen from using a version of GCC that was too new were related to C++ in GCC 10, which became stricter about certain rules and requires some additional headers to be explicitly included.

The biggest advantage of moving to gcc-9 is that it has full C++ 17 support which gcc-7 does not. And it even handles a decent amount of C++ 20. I don't think we have anything right now that requires GCC 10 or above (though that will change as C++ 20 is more widely used). Backwards compatibility for C and Fortran is also unlikely to change given how much legacy code is written in those languages.

I'm not adamant about using Ubuntu 20.04 but if I had to guess, we would end up with far fewer packages needing older versions of GCC (which we do package anyway) with gcc-9 than needing a newer GCC with gcc-7.

sjackman commented 2 years ago

Summarizing from above

Build bottles using

Homebrew formulae

Versioning the glibc formula as glibc@2.27 may make the upgrade process easier. The process of upgrading glibc and gcc can be tricky, because it can break essential command line utilities required by brew that depend on ld.so, libc.so.6, and libstdc++.so.6. We've done it once before when migrating from Ubuntu 14.04 to Ubuntu 16.04, so it is possible, but needs testing.

My inclination is to build bottles on Ubuntu 18.04 rather than Ubuntu 20.04, because it reduces the number of client systems that need to install the brewed Glibc. Brewed glibc works pretty darn well, but it's best to use the system Glibc if possible. Testing this upgrade process will likely be the trickiest part.

Host Glibc version

61 formulae mention C++17. Only one formula, btop, has fails_with gcc: "7".

danielnachun commented 2 years ago

Versioning the glibc formula as glibc@2.27 may make the upgrade process easier. The process of upgrading glibc and gcc can be tricky, because it can break essential command line utilities required by brew that depend on ld.so, libc.so.6, and libstdc++.so.6. We've done it once before when migrating from Ubuntu 14.04 to Ubuntu 16.04, so it is possible, but needs testing.

I agree that versioning is probably a must so that we don't unlink the current glibc keg, which can lead to utter chaos. We could just overwrite the links in $HOMEBREW_PREFIX/lib without ever deleting them - they'd just start pointing to the newer glibc keg. Nothing should break when moving to a newer glibc and libstdc++ since they are backwards compatible.

My inclination is to build bottles on Ubuntu 18.04 rather than Ubuntu 20.04, because it reduces the number of client systems that need to install the brewed Glibc. Brewed glibc works pretty darn well, but it's best to use the system Glibc if possible. Testing this upgrade process will likely be the trickiest part.

I didn't think about that but it's a very good point. Even when glibc is relocatable, there are other quirks that come up when using brewed glibc, especially when mixing software built outside Homebrew with software built within it. So minimizing that is probably preferable to using a newer gcc.

61 formulae mention C++17. Only one formula, btop, has fails_with gcc: "7".

Unfortunately I know for certain there are other formulae where we did not explicitly add fails_with for any version of GCC other than 5, but which require GCC 8 or 9. Regardless, the list of formulae which will need a newer libstdc++ will still be significantly shorter when using GCC 7 compared to 5, so it will still be a large step up.

Bo98 commented 2 years ago

I agree that versioning is probably a must so that we don't unlink the current glibc keg, which can lead to utter chaos

I'm not sure if I understand the concern here. We don't seem to have any hesitation revision bumping gcc@5?

MikeMcQuaid commented 2 years ago
  • Stay a little bit longer on ubuntu 16.04? Not sure we want this.

Personally, I think we should have dropped it when it stopped being supported. Whatever we move to: we should plan to keep up with supported versions in future and consider picking a distro/version accordingly.

I would like to gather some opinions on what we should aim for first. Which glibc version? Should we go for gcc-6 or straight to gcc-7 with Ubuntu 18.04?

  • Jump right to Ubuntu 20.04? But some packages might not build with newest gcc, so it's probably good to lag a little bit behind and let other package managers do the heavy lifting.

Are 18.04 and 20.04 both LTS? When do they lose support?

  • Use Debian (or something else) for the builds? This might be an opportunity to ship at a different pace / aim for an intermediate glibc version.

Similar question: what versions of Debian are receiving security updates and until when?

From the perspective of "we want something old and stable to build against": Debian makes more sense to me than Ubuntu, personally.

  • Do 2 (or 3) different linux builds: one for the newest ubuntu, one for the oldest? This implies creating new x86_64_linux stanzas for bottle blocks. Does it bring any benefit? I don't know. Maybe potentially more build breakages and even more tweaks. This also implies a second self-hosted runner.

This seems like a lot of work for something that hasn't been requested by users.

  • Drop the gap between the 2 glibc versions we support: I do not really like the fact that we have to build binutils in a separate container, and do not really expect to support users on older systems anyway (we got rid of patchelf though). We could get rid of that idea and make clear that we have only one single minimal glibc version we support.

Yes, I like this idea. If nothing else: let's use the same distro rather than Debian for some stuff and Ubuntu for other stuff.

My inclination is to build bottles on Ubuntu 18.04 rather than Ubuntu 20.04, because it reduces the number of client systems that need to install the brewed Glibc. Brewed glibc works pretty darn well, but it's best to use the system Glibc if possible. Testing this upgrade process will likely be the trickiest part.

Do we have any analytics on this? If not, we should consider gathering them ASAP to get data until we make this change.

sjackman commented 2 years ago

I agree that versioning is probably a must so that we don't unlink the current glibc keg, which can lead to utter chaos. We could just overwrite the links in $HOMEBREW_PREFIX/lib without ever deleting them - they'd just start pointing to the newer glibc keg.

The dynamic linker symlink /home/linuxbrew/.linuxbrew/lib/ld.so in particular can never be removed and has to be overwritten in place. As I recall it's actually okay to unlink glibc briefly during the upgrade. The dynamic linker is still able to find libc.so.6 in the keg, because the path to it is hardcoded in the dynamic linker.
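One common way to overwrite a symlink in place, so that the path never disappears, is to create the new link under a temporary name and rename it over the old one. The Python sketch below assumes that approach. The helper name `replace_symlink` and the demo paths are hypothetical, and this is not taken from brew's own handling of ld.so.

```python
import os
import tempfile


def replace_symlink(target, link_path):
    """Point link_path at target without ever removing link_path.

    The new symlink is created under a temporary name in the same directory
    and then renamed over the old one; rename is atomic on POSIX filesystems,
    so readers see either the old link or the new one, never a missing path.
    """
    temp_path = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(target, temp_path)
    try:
        os.replace(temp_path, link_path)
    except OSError:
        # Clean up the temporary link if the rename failed.
        os.unlink(temp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as prefix:
        old_target = os.path.join(prefix, "old-ld.so")
        new_target = os.path.join(prefix, "new-ld.so")
        for path in (old_target, new_target):
            open(path, "w").close()
        link = os.path.join(prefix, "ld.so")
        os.symlink(old_target, link)
        replace_symlink(new_target, link)
        print(os.readlink(link))
```

The alternative, deleting the link and then recreating it, leaves a short window in which nothing exists at the path, which is exactly what has to be avoided for the dynamic linker.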

sjackman commented 2 years ago
  • Stay a little bit longer on ubuntu 16.04? Not sure we want this.

Personally, I think we should have dropped it when it stopped being supported. Whatever we move to: we should plan to keep up with supported versions in future and consider picking a distro/version accordingly.

The volunteer maintainers of Homebrew were dealing with the merger of Linuxbrew-core into Homebrew-core and wanted to focus on and deal with one major migration at a time. Now that the merging of cores is complete, we are migrating the distro on which we build bottles.

Are 18.04 and 20.04 both LTS? When do they lose support?

Yes, both are LTS, and support for LTS releases is five years. Ubuntu 18.04 will be supported until 2023-04. I would recommend upgrading our bottle build system to Ubuntu 20.04 between 2023-01 and 2023-04.

  • Use Debian (or something else) for the builds? This might be an opportunity to ship at a different pace / aim for an intermediate glibc version.

Similar question: what versions of Debian are receiving security updates and until when?

https://wiki.debian.org/DebianReleases#Production_Releases

Debian lifecycles are not specified as far in advance as Ubuntu's. Debian 10 (Buster) is supported until approximately 2022-08. The life cycle of Debian 11 (Bullseye) is not yet specified. Historically Debian's support is roughly three years following release.

https://ubuntu.com/about/release-cycle

Ubuntu LTS releases are every two years in April. Ubuntu's long term support is precisely five years.
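The five-year rule makes the end-of-support month easy to derive from the version number. A small Python sketch, assuming Ubuntu's `YY.MM` version scheme and the five-year figure stated above; the function name `lts_end_of_support` is hypothetical, and the result is only the month implied by the rule, not a date published by Canonical.

```python
from datetime import date

LTS_SUPPORT_YEARS = 5  # standard support period stated above


def lts_end_of_support(release):
    """Return the first day of the month in which standard support ends.

    `release` is an Ubuntu version string such as "18.04" (year.month).
    """
    year, month = (int(part) for part in release.split("."))
    return date(2000 + year + LTS_SUPPORT_YEARS, month, 1)


if __name__ == "__main__":
    for release in ("16.04", "18.04", "20.04"):
        print(release, lts_end_of_support(release).strftime("%Y-%m"))
```

For 16.04 and 18.04 this reproduces the 2021-04 and 2023-04 dates mentioned in this thread.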

From the perspective of "we want something old and stable to build against": Debian makes more sense to me than Ubuntu, personally.

Ubuntu's release cadence and support lifecycle is more predictable and more precisely specified than Debian's and ~two years longer than Debian's.

  • Do 2 (or 3) different linux builds: one for the newest ubuntu, one for the oldest? This implies creating new x86_64_linux stanzas for bottle blocks. Does it bring any benefit? I don't know. Maybe potentially more build breakages and even more tweaks. This also implies a second self-hosted runner.

This seems like a lot of work for something that hasn't been requested by users.

We build one bottle that works on all supported Linux systems, so there's not much need to build multiple bottles.

  • Drop the gap between the 2 glibc versions we support: I do not really like the fact that we have to build binutils in a separate container, and do not really expect to support users on older systems anyway (we got rid of patchelf though). We could get rid of that idea and make clear that we have only one single minimal glibc version we support.

Yes, I like this idea. If nothing else: let's use the same distro rather than Debian for some stuff and Ubuntu for other stuff.

We build binutils on Debian 7 (Wheezy) so that users can use it to build glibc from source, because the glibc bottle is not relocatable. An alternative solution would be to use @danielnachun's PR https://github.com/Homebrew/brew/pull/12534 Binary patching of build prefixes to relocate the glibc bottle. An interesting option, once this PR is merged, is to use it to build and deploy a relocatable bottle only for glibc by default at first, before this PR is rolled out more widely to other formulae.

My inclination is to build bottles on Ubuntu 18.04 rather than Ubuntu 20.04, because it reduces the number of client systems that need to install the brewed Glibc. Brewed glibc works pretty darn well, but it's best to use the system Glibc if possible. Testing this upgrade process will likely be the trickiest part.

Do we have any analytics on this? If not, we should consider gathering them ASAP to get data until we make this change.

Behaviour -> Events -> Top Events

On 2021-12-15 ~16% of events (counting only these three OS versions) are on a system whose Glibc version is less than version 2.31, which is the version provided by Ubuntu 20.04 (Focal Fossa).

danielnachun commented 2 years ago

It's worth noting that CentOS 7 is probably still the predominant HPC OS for scientific computing; it is on glibc 2.17 and its end of life is June 30, 2024. I don't want to be too pushy about this but I really agree with @Bo98 that the glibc minimum version question should be a totally separate discussion. I've proposed a set of changes that would allow us to eliminate minimum glibc versions entirely in https://github.com/Homebrew/brew/issues/12565#issuecomment-994090809, and I'd like to try to bring those to fruition before we make any changes regarding that.

We build binutils on Debian 7 (Wheezy) so that users can use it to build glibc from source, because the glibc bottle is not relocatable. An alternative solution would be to use @danielnachun's PR #12534 Binary patching of build prefixes to relocate the glibc bottle. An interesting option, once this PR is merged, is to use it to build and deploy a relocatable bottle only for glibc by default at first, before this PR is rolled out more widely to other formulae.

I'll have to test to be sure, but once that PR is merged, anyone who installs Homebrew on Linux in a prefix that is of equal or shorter length than /home/linuxbrew/.linuxbrew should already be able to install glibc and all other previously non-relocatable bottles. We could then make a glibc bottle that is relocatable to any prefix just by building it in a very long prefix (probably 255 characters to match what Anaconda does) and distributing that (which is what we will eventually do automatically for all non-relocatable formulae). That would let us limit the use of the Wheezy container to building portable-ruby for now, though my idea about statically linking that to musl would let us eventually drop that too. So I think either way we can eventually end up using only one container for everything.
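The length constraint comes from patching strings inside a binary without moving anything else in the file. The Python sketch below shows that general technique under simplifying assumptions. It is not the implementation in the PR, and the function name `relocate_prefix` is hypothetical. It rewrites each NUL-terminated string that contains the build prefix and pads the result with NUL bytes back to its original length.

```python
def relocate_prefix(data, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix inside NUL-terminated strings.

    Every patched string keeps its original byte length: the shorter result
    is padded with NUL bytes, so offsets elsewhere in the file do not move.
    This only works when new_prefix is no longer than old_prefix.
    """
    if not old_prefix:
        raise ValueError("old prefix must not be empty")
    if len(new_prefix) > len(old_prefix):
        raise ValueError("new prefix must not be longer than the build prefix")
    patched = bytearray(data)
    start = patched.find(old_prefix)
    while start != -1:
        # The string runs from the prefix to the next NUL (or end of data).
        end = patched.find(b"\0", start)
        if end == -1:
            end = len(patched)
        segment = bytes(patched[start:end])
        replaced = segment.replace(old_prefix, new_prefix)
        patched[start:end] = replaced + b"\0" * (len(segment) - len(replaced))
        start = patched.find(old_prefix, end)
    return bytes(patched)


if __name__ == "__main__":
    blob = b"\x7fELF" + b"/home/linuxbrew/.linuxbrew/lib/ld.so\0" + b"tail"
    out = relocate_prefix(blob, b"/home/linuxbrew/.linuxbrew", b"/opt/brew")
    print(len(out) == len(blob), out)
```

Building in a deliberately long prefix means every real install prefix is shorter, so the padding step always has room.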

danielnachun commented 2 years ago

I'm not sure if I understand the concern here. We don't seem to have any hesitation revision bumping gcc@5?

The dynamic linker symlink /home/linuxbrew/.linuxbrew/lib/ld.so in particular can never be removed and has to be overwritten in place. As I recall it's actually okay to unlink glibc briefly during the upgrade. The dynamic linker is still able to find libc.so.6 in the keg, because the path to it is hardcoded in the dynamic linker.

I'm not sure if I changed this to be this way on my installation but right now $HOMEBREW_PREFIX/ld.so is a symlink into the glibc cellar, so it seems like unlinking is okay after all. We just have to make sure it's not deleted.

Bo98 commented 2 years ago

ld.so already receives special handling in brew and is separate from the usual linking process: https://github.com/Homebrew/brew/blob/9c03493774500cf16ced8938e1eb4eeae8216b20/Library/Homebrew/extend/os/linux/install.rb#L40-L52

This is run every time before a formula is installed.

MikeMcQuaid commented 2 years ago
  • Stay a little bit longer on ubuntu 16.04? Not sure we want this.

Personally, I think we should have dropped it when it stopped being supported. Whatever we move to: we should plan to keep up with supported versions in future and consider picking a distro/version accordingly.

The volunteer maintainers of Homebrew were dealing with the merger of Linuxbrew-core into Homebrew-core and wanted to focus on and deal with one major migration at a time. Now that the merging of cores is complete, we are migrating the distro on which we build bottles.

Sure, to be clear: there's no critique of individuals here, just as a project as a whole we should avoid relying on unsupported software.

I don't think the "volunteer maintainers" language is really needed when all of us in this discussion are "volunteer maintainers" 😄. I think we should be able to say "Homebrew should do X" in these discussions without it being a passive aggressive critique of anyone. For what it's worth, when I say "we should" or "Homebrew should": I include myself in the group of people who ideally "should" have done this work!

Are 18.04 and 20.04 both LTS? When do they lose support?

Yes, both are LTS, and support for LTS releases is five years. Ubuntu 18.04 will be supported until 2023-04. I would recommend upgrading our bottle build system to Ubuntu 20.04 between 2023-01 and 2023-04.

I can see two potential angles here:

  1. we should upgrade straight to 20.04 to avoid having to do a painful migration twice in two years
  2. we should upgrade to 18.04 first to be able to make smaller, incremental changes and create documentation/process/tooling to ease doing so with 20.04 a year later

Either of these seem reasonable. Both have trade-offs.

  • Use Debian (or something else) for the builds? This might be an opportunity to ship at a different pace / aim for an intermediate glibc version.

Similar question: what versions of Debian are receiving security updates and until when?

https://wiki.debian.org/DebianReleases#Production_Releases

Debian lifecycles are not specified as far in advance as Ubuntu's. Debian 10 (Buster) is supported until approximately 2022-08. The life cycle of Debian 11 (Bullseye) is not yet specified. Historically Debian's support is roughly three years following release.

https://ubuntu.com/about/release-cycle

Ubuntu LTS releases are every two years in April. Ubuntu's long term support is precisely five years.

From the perspective of "we want something old and stable to build against": Debian makes more sense to me than Ubuntu, personally.

Ubuntu's release cadence and support lifecycle is more predictable and more precisely specified than Debian's and ~two years longer than Debian's.

Again I can see two potential angles here:

  1. we should stay on Ubuntu as it's what we're already using and has a longer support time
  2. we should move to Debian as we're already using it for certain bottles and Ubuntu will already necessitate moving twice in two years so the shorter Debian lifecycle is unlikely to result in more migrations over the next e.g. 5 years. Relatedly, migrating every 3 years rather than every 5 has a greater chance that the tooling/process/documentation/maintainers will not have had as much change in that time

Again: either of these seem reasonable but both have trade-offs.

Do we have any analytics on this? If not, we should consider gathering them ASAP to get data until we make this change.

Behaviour -> Events -> Top Events

🎉

On 2021-12-15 ~16% of events (counting only these three OS versions) are on a system whose Glibc version is less than version 2.31, which is the version provided by Ubuntu 20.04 (Focal Fossa).

I think that would be a reasonable level of support to drop.

I also think we should be more explicit about our levels of support (like we are with macOS) and warn more in cases where people are having to build from source, due to the disproportionate support burden that results.

I'll have to test to be sure, but once that PR is merged, anyone who installs Homebrew on Linux in a prefix that is of equal or shorter length than /home/linuxbrew/.linuxbrew should already be able to install glibc and all other previously non-relocatable bottles. We could then make a relocatable bottle for all prefixes for glibc just by building the bottle in a very long prefix (probably 255 characters to match what Anaconda does) and distribute that (which is what we will do automatically eventually for all non-relocatable formulae).

This is a nice idea. I'm in favour of using this if it works as expected. I don't think we should block a migration on its success, though.

That would let us limit the use of the Wheezy container to building portable-ruby for now, though my idea about statically linking that to musl would let us eventually drop that too. So I think either way we can eventually end up using only one container for everything.

I'd want that musl support patch to be upstreamed before we go down this route.

Bo98 commented 2 years ago

I'd want that musl support patch to be upstreamed before we go down this route.

It's upstreamed in the sense that it's been sent upstream. But no one has really looked at it. It is however used in the official Docker Ruby image and Alpine Linux.

MikeMcQuaid commented 2 years ago

It's upstreamed in the sense that it's been sent upstream. But no one has really looked at it. It is however used in the official Docker Ruby image and Alpine Linux.

Ok, that raises my confidence in it somewhat. Would still be nice to attempt to poke upstream and see if we have better luck getting it merged.

sjackman commented 2 years ago

I agree that versioning is probably a must so that we don't unlink the current glibc keg, which can lead to utter chaos

I'm not sure if I understand the concern here. We don't seem to have any hesitation revision bumping gcc@5?

It has the potential to be a problem if an executable that brew uses were to depend on libstdc++.so.6. That it's not a problem practically suggests that's not the case.

Bo98 commented 2 years ago

The main concerns for brew are curl and git, though we require a system version anyway in order to install brew and will always prioritise the system version unless HOMEBREW_FORCE_BREWED_{GIT,CURL} is set or under certain conditions where the system one is ancient (though you may have issues installing the brewed git/curl anyway).

tar and whatever is used to extract bottles is a concern. We should probably better control whether we use system or brewed versions of those. It might already prioritise the system though since it's outside of any build environments.

We no longer use a patchelf executable.

When building from source a functional gawk, make, sed is required. If glibc breaking them is a concern, we can change it so system versions are prioritised when building glibc.

One can test by using Debian 7, installing these brewed versions of these tools, force removing glibc and see what breaks.

In general, we've not got too many external dependencies.

danielnachun commented 2 years ago

Just to reiterate here, the only thing we need to do to maintain support for older glibc is to update the glibc package to 2.27, and that can be done in Ubuntu 18.04. I also think, as @Bo98 said, we really aren't talking about much additional support burden here. This is why I really think the whole discussion about changing the minimum glibc version and using Debian Wheezy is really not very important right now. We're talking about 2 packages we build in Debian 7 - binutils and portable-ruby, neither of which are updated very often, and the glibc package, which needs a bit of special treatment but nothing too crazy. Aside from the above-mentioned updating of glibc, I really think we can just leave this as is, as it won't otherwise block the migration.

I will be very transparent here in saying that the reason I'm so outspoken on this issue is because dropping support for older glibc would force me to stop using Homebrew for my main use case, which is to get a better development environment on a CentOS 7 HPC system. I've put in a lot of work to try to improve our support for older Linux systems, and I would hate to throw it away.

I think the biggest issue here just comes from our different backgrounds - I and a few others (@sjackman and @maxim-belkin at least) come from the HPC world, where there is simply no way to get a newer host glibc. However these HPC systems are far from unsupported, and as I mentioned above CentOS 7 is supported through June 2024. I don't think we can compare supporting older versions of glibc to supporting older versions of macOS. Apple is relatively aggressive about dropping support for older OSes, in no small part because of how closely tied they are to the hardware. In the Linux world things are different - the HPC cluster I use has cutting edge hardware, but still runs on a fairly old Linux userspace. I wish the HPC world was less conservative about these things, but unfortunately it's not.

If supporting older Linux distros was not something any active maintainers were interested in doing and it created a lot of additional support burden, I would understand the desire to drop it. However we do have maintainers that are interested in this and I've tried to articulate a plan for how to make that support even more seamless, and contributed PRs to that end. All I'm asking for here is to consider the glibc minimum version a separate issue so we have more time to solve it. The only way it interacts with updating to a newer Ubuntu/Debian version is that we need to update the glibc formula and make sure updates to that don't break existing installs. As soon as we've decided what our new target OS will be, I (and hopefully others) will be happy to help with the necessary testing for that. I know @iMichka has shouldered a lot of the burden of handling that in the past but I'm willing to help as much as I can!

sjackman commented 2 years ago

I've been quiet as of late, but I'm happy to help contribute to the migration of the bottling infrastructure from Ubuntu 16.04 to Ubuntu 18.04. I would also like to maintain support for CentOS 7 for the same reasons cited by Daniel. CentOS is the OS used by both my previous employer and my current employer on their HPC systems.

danielnachun commented 2 years ago

Would it be helpful for me to make a separate issue on "Improving support for older Linux systems"? That would help us separate that discussion from this one, so we can focus here on choosing the right distro and any other technicalities we need to deal with on the CI side of things.

MikeMcQuaid commented 2 years ago

I think the biggest issue here just comes from our different backgrounds - I and a few others (@sjackman and @maxim-belkin at least) come from the HPC world, where there is simply no way to get a newer host glibc. However these HPC systems are far from unsupported, and as I mentioned above CentOS 7 is supported through June 2024.

If it helps: I'm not so concerned about the age of an OS but the support status. This begs the question for me: why don't we use CentOS 7 for bottling at least binutils/portable-ruby/glibc and, potentially: everything?

Like we do for portable Ruby on macOS: there's a pretty strong argument for producing your binaries on the oldest system you wish to support e.g. macOS Yosemite for macOS' portable Ruby. If we've got a Linux OS used by multiple maintainers that's supported for another 3.5 years: what are the downsides of using it? I know GitHub Actions doesn't support it natively but Docker makes this fairly moot.

sjackman commented 2 years ago

binutils/portable-ruby/glibc

Minor technicality: glibc can be built on our usual bottling OS because it has no run-time dependencies, specifically no run-time dependency on glibc, because it itself is glibc.

and, potentially: everything?

I would prefer to use a distribution of Linux for bottling that has a more consistent release cycle than CentOS 7. We use the version of Glibc and GCC provided by the bottling host OS to build bottles. For that reason, we want the GCC provided by the bottling OS to be reasonably up-to-date, to avoid needing brewed GCC to build most formulae.

This begs the question for me: why don't we use CentOS 7 for bottling at least binutils/portable-ruby

Debian 7 provides Glibc 2.13
Ubuntu 12.04 provides Glibc 2.15
CentOS 7 provides Glibc 2.17

Debian 7 was chosen because executables compiled on it can run on CentOS 7, and because it's a similar distribution to Ubuntu, it allows us to share a substantial amount of the Dockerfile configuration between it and Ubuntu. I'm not strongly motivated to change it so long as it continues to work, but I'm not against changing the bottling of portable-ruby and binutils to CentOS 7 if someone wanted to take that on.

sjackman commented 2 years ago

One option perhaps worth giving a go is creating a keg-only formula for glibc@2.13 and making it a build dependency when building bottles for portable-ruby and binutils. I'm sure it'd need a bit of experimentation to make that work as intended, but it sounds plausible, and we could then drop the older bottling image.

MikeMcQuaid commented 2 years ago

I would prefer to use a distribution of Linux for bottling that has a more consistent release cycle than CentOS 7.

Can you elaborate on this?

For that reason, we want the GCC provided by the bottling OS to be reasonably up-to-date, to avoid needing brewed GCC to build most formulae.

Similarly, can you elaborate on this? Are we not generally optimising for the bottled rather than build from source case? Why is the brewed GCC a problem?

Debian 7 provides Glibc 2.13

This is unsupported today, no?

Ubuntu 12.04 provides Glibc 2.15

This will be unsupported next year.

I'm not strongly motivated to change it so long as it continues to work, but I'm not against changing the bottling of portable-ruby and binutils to CentOS 7 if someone wanted to take that on.

I'm not 😁. I'm more coming from the position that if we have multiple people saying "please let's support CentOS 7" it seems like it'd be a good fit at least to build our oldest stuff on if not do our actual bottling on.

One option perhaps worth giving a go is creating a keg-only formula for glibc@2.13 and making it a build dependency when building bottles for portable-ruby and binutils. I'm sure it'd need a bit of experimentation to make that work as intended, but it sounds plausible, and we could then drop the older bottling image.

Would need to be statically linked for portable-ruby which will add some complexity here. Also, this would need to be two formulae as a result. I'm not sure this is worth the effort (unless you're volunteering to do so).

MikeMcQuaid commented 2 years ago

Debian 7 was chosen because executables compiled on it can run on CentOS 7, and because it's a similar distribution to Ubuntu, it allows us to share a substantial amount of the Dockerfile configuration between it and Ubuntu.

And, just to clarify: what would the work involved be here beyond "make sure the Dockerfile works as expected"?

iMichka commented 2 years ago

Let's try to summarise (it's hard because a lot of things have been discussed above).

We can make a small poll if you want but if no-one has strong opinions against these first few decisions, that would give us a first direction.

What I would really like is CI that tests that nothing gets broken (before, during and after the migration). It could be in a separate repo if necessary.

sjackman commented 2 years ago

I would prefer to use a distribution of Linux for bottling that has a more consistent release cycle than CentOS 7.

Can you elaborate on this?

CentOS 7 was released in 2014-07, and then there was a five year gap until CentOS 8 was released in 2019-09. https://en.wikipedia.org/wiki/CentOS#End-of-support_schedule

Ubuntu LTS is released every two years in April and is supported for five years. Ubuntu 22.04 (Jammy Jellyfish) for example will be released in 2022-04 and be supported until 2027-04. https://en.wikipedia.org/wiki/Ubuntu#Releases

For that reason, we want the GCC provided by the bottling OS to be reasonably up-to-date, to avoid needing brewed GCC to build most formulae.

Similarly, can you elaborate on this? Are we not generally optimising for the bottled rather than build from source case? Why is the brewed GCC a problem?

Homebrew uses the GCC (/usr/bin/gcc) provided by our bottling Docker image to build bottles. We do optimize the user experience to use bottles rather than building from source, but I'm talking about our bottling infrastructure here.

Debian 7 provides Glibc 2.13

This is unsupported today, no?

Yes, Debian 7 (Wheezy) is EOL on 2018-06-01. https://en.wikipedia.org/wiki/Debian_version_history#Debian_7_(Wheezy)

Ubuntu 12.04 provides Glibc 2.15

This will be unsupported next year.

Ubuntu 12.04 LTS (Precise Pangolin) is EOL on 2017-04. Ubuntu LTS is supported for five years.

One option perhaps worth giving a go is creating a keg-only formula for glibc@2.13 and making it a build dependency when building bottles for portable-ruby and binutils. I'm sure it'd need a bit of experimentation to make that work as intended, but it sounds plausible, and we could then drop the older bottling image.

Would need to be statically linked for portable-ruby which will add some complexity here. Also, this would need to be two formulae as a result.

portable-ruby is dynamically linked to glibc. Static linking to glibc is not generally recommended. https://stackoverflow.com/questions/57476533/why-is-statically-linking-glibc-discouraged

We statically link to all of portable-ruby's dependencies except glibc.

$ docker run -it --rm ghcr.io/homebrew/brew:3.3.9 readelf -d /home/linuxbrew/.linuxbrew/Homebrew/Library/Homebrew/vendor/portable-ruby/2.6.8/bin/ruby | grep NEEDED
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libcrypt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libutil.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [ld-linux-x86-64.so.2]
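
The same check can be scripted. A minimal sketch, not part of Homebrew: `needed_libs` and `max_glibc_symver` are hypothetical helper names. The first pulls the library names out of `readelf -d` output; the second finds the highest `GLIBC_x.y` symbol version in `objdump -T` output, which gives a lower bound on the glibc the binary needs at run time. Both read standard input, so they can be tried on saved output as well as on a live binary.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Homebrew) for inspecting an ELF binary.

# Print the shared-library names from `readelf -d` output read on stdin.
# Only lines tagged (NEEDED) are considered.
needed_libs() {
  sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Print the highest GLIBC_x.y symbol version found in `objdump -T` output
# read on stdin. `sort -V` (version sort) is a GNU coreutils extension.
max_glibc_symver() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -uV | tail -n 1
}
```

For example, `readelf -d /path/to/ruby | needed_libs` lists the dynamic dependencies, and `objdump -T /path/to/ruby | max_glibc_symver` prints the newest glibc symbol version the binary references.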

I'm not sure this is worth the effort (unless you're volunteering to do so).

It would allow us to stop using/maintaining the homebrew/debian7 Docker image. I'll take a stab at it over the break, and at least find out how easy/hard it may be.

sjackman commented 2 years ago

Debian 7 was chosen because executables compiled on it can run on CentOS 7, and because it's a similar distribution to Ubuntu, it allows us to share a substantial amount of the Dockerfile configuration between it and Ubuntu.

And, just to clarify: what would the work involved be here beyond "make sure the Dockerfile works as expected"?

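https://github.com/Homebrew/homebrew-linux-dev/blob/master/Dockerfile

  • Make a version of this Dockerfile that's based on the Docker image centos:7
  • Test that image by using it to build bottles for portable-ruby and binutils, and then test those bottles by using them to build say gcc from source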

iMichka commented 2 years ago

And also: take a few plain docker images (old debian, old centos, ubuntu 16, 18 and 20), let CI install Homebrew on each of them, install a few formulae and run the tests. All these steps should always work. Maybe a nightly build would do the job, or something that is tested on each Homebrew/brew PR.
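
A rough sketch of what such a job could loop over, purely as an illustration: the image tags in `IMAGES` and the `/smoke-test.sh` script are assumptions, not an existing Homebrew CI job. The script stands in for "install Homebrew, install a few formulae, run the tests" and would have to be mounted or baked into each image. With `DRY_RUN=1` the function only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Hypothetical smoke-test driver. The image list and the in-container command
# are assumptions for illustration, not an existing Homebrew CI job.

IMAGES="debian:7 centos:7 ubuntu:16.04 ubuntu:18.04 ubuntu:20.04"

# Placeholder for "install Homebrew, install a few formulae, run the tests".
SMOKE_CMD=${SMOKE_CMD:-"/smoke-test.sh"}

# Run (or, with DRY_RUN=1, just print) the smoke test in every image.
# Returns non-zero if the smoke test failed in any image.
run_matrix() {
  failed=0
  for image in $IMAGES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "docker run --rm $image $SMOKE_CMD"
    elif ! docker run --rm "$image" $SMOKE_CMD; then
      echo "FAILED: $image" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Continuing past a failing image rather than stopping at the first one means a nightly run reports every broken distribution at once.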

MikeMcQuaid commented 2 years ago

Homebrew uses the GCC (/usr/bin/gcc) provided by our bottling Docker image to build bottles. We do optimize the user experience to use bottles rather than building from source, but I'm talking about our bottling infrastructure here.

@sjackman Gotcha so the issue here is more the libstdc++ and other libraries rather than actual compiler used, if I've understood correctly? We could potentially consider pulling some of these into a similar formula to glibc that could be pretty small.

CC @fxcoudert for GCC knowledge help 😁

We statically link to all of portable-ruby's dependencies except glibc.

Gotcha, didn't know this, TIL! Yeh, the keg-only glibc seems like a good call in this case 👍🏻

It would allow us to stop using/maintaining the homebrew/debian7 Docker image. I'll take a stab at it over the break, and at least find out how easy/hard it may be.

Yeh, sounds great, I'd love to see it go away.

  • Make a version of this Dockerfile that's based on the Docker image centos:7
  • Test that image by using it to build bottles for portable-ruby and binutils, and then test those bottles by using them to build say gcc from source

👍🏻 thanks. Does CentOS have any concept of backports we could use/that are typically installed on these sort of HPC clusters?

And also: take a few plain docker images (old debian, old centos, ubuntu 16, 18 and 20), let CI install Homebrew on each of them, install a few formulae and run the tests. All these steps should always work. Maybe a nightly build would do the job, or something that is tested on each Homebrew/brew PR.

Nightly sounds good. Should migrate these old Docker images from Linuxbrew over to here for the OSs we want to make sure work.

carlocab commented 2 years ago

Gotcha so the issue here is more the libstdc++ and other libraries rather than actual compiler used, if I've understood correctly? We could potentially consider pulling some of these into a similar formula to glibc that could be pretty small.

Multiple package managers have a separate package for the GCC runtime libraries. We could try doing something similar, but we might be a bit limited by there being a one-to-one map between a build and the formula that drives the build. (i.e. I think it would've been useful for this to have a build that runs once but actually produces bottles for two formulae)

sjackman commented 2 years ago

Homebrew uses the GCC (/usr/bin/gcc) provided by our bottling Docker image to build bottles. We do optimize the user experience to use bottles rather than building from source, but I'm talking about our bottling infrastructure here.

Gotcha so the issue here is more the libstdc++ and other libraries rather than actual compiler used, if I've understood correctly? We could potentially consider pulling some of these into a similar formula to glibc that could be pretty small.

Both. For the bottling infrastructure, we use /usr/bin/gcc to build formulae, and so we want that compiler to be sufficiently recent that most formulae are buildable with it. Formulae that aren't buildable with the bottling infrastructure /usr/bin/gcc have a build dependency on brewed gcc. The resulting bottle may have a run-time dependency on brewed gcc for libstdc++.so.6 or other shared libraries provided by gcc if needed.
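
To make the rule concrete, here is a small sketch of the decision. It is not Homebrew's actual compiler-selection code; `gcc_major` and `needs_brewed_gcc` are hypothetical names, and the minimum version a formula needs is simply passed in as a number.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the rule above; not Homebrew's own logic.

# Print the major version from a GCC version string such as "5.4.0" or "7".
gcc_major() {
  printf '%s\n' "${1%%.*}"
}

# Succeed (exit 0) if a formula that needs at least GCC major version $2
# cannot be built with a host compiler of version $1, i.e. the formula would
# need a build dependency on brewed gcc.
needs_brewed_gcc() {
  [ "$(gcc_major "$1")" -lt "$2" ]
}
```

On a bottling image this could be called as `needs_brewed_gcc "$(/usr/bin/gcc -dumpversion)" 7`. The newer the host GCC, the fewer formulae take the brewed-gcc path.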

Does CentOS have any concept of backports we could use/that are typically installed on these sort of HPC clusters?

This looks like it? I'm not familiar with softwarecollections.org. https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/

$ docker run --rm centos/devtoolset-7-toolchain-centos7 gcc --version
gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
$ docker run --rm centos/devtoolset-7-toolchain-centos7 ldd --version
ldd (GNU libc) 2.17

MikeMcQuaid commented 2 years ago

Thanks @carlocab and @sjackman.

This looks like it? I'm not familiar with softwarecollections.org. https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/

I wonder if something like this would enable anything or solve any problems?

@sjackman @danielnachun in the HPC environments you work in is something like this:

  • already installed?
  • something you could get an admin to install for you?

sjackman commented 2 years ago

in the HPC environments you work in is something like this: already installed?

Surprisingly, yes. I've never heard of it. I'm not super familiar with administering CentOS 7 systems though. It has no packages installed.

$ which scl
/bin/scl
$ scl --list

SCL does require sudo privilege to install packages.

sudo yum install devtoolset-7 # admin
scl enable devtoolset-7 bash # user

https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7
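
A minimal sketch of how a script might prefer the devtoolset compiler when it happens to be present and fall back to the host compiler otherwise. The helper names are hypothetical, and it assumes `scl --list` prints one collection name per line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; they assume `scl --list` prints one collection per line.

# Succeed if the collection named $1 appears in `scl --list` output on stdin.
has_collection() {
  grep -qx -- "$1"
}

# Print the GCC version from devtoolset-7 if that collection is installed,
# otherwise the version of whatever `gcc` is first on PATH.
gcc_version() {
  if command -v scl >/dev/null 2>&1 && scl --list | has_collection devtoolset-7; then
    scl enable devtoolset-7 'gcc --version'
  else
    gcc --version
  fi
}
```

On the system above, where `scl --list` prints nothing, `gcc_version` would take the fallback branch.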

something you could get an admin to install for you?

At my workplace, likely yes. Larger (government/academic) shared HPC sites, probably not. Each HPC site has its own "solution" for installing software packages.

danielnachun commented 2 years ago

Unfortunately for HPC systems, we have to assume that 1) only the bare essentials of the host OS are installed and 2) the HPC admins will be unresponsive to installation or configuration requests. The HPC system I currently use has excellent, responsive staff but I've interacted with other systems which did not. That's the real appeal of Homebrew in those situations - you are the sysadmin and you have total control over your installation. Homebrew also provides a consistent, uniform experience across the severely fragmented HPC space, which other tools like Spack, Anaconda, Lmod, etc. cannot do.

With some assistance from @sjackman, I've gotten glibc@2.27 to build from source with the host toolchain in a CentOS 7 VM. I'll be able to open a PR for that once we've got an Ubuntu 18.04 runner ready.

As for splitting up GCC into a separate compiler and library, this could work but I suspect that most of the cases where the user has an old GCC are also ones where they do not have admin access and are thus in non-default prefixes. Until we have full binary relocation, any non-relocatable packages which need a newer libstdc++ will also need the newer compiler to build them from source. So I'm not sure we'd save users much space unless they only happen to install packages which don't use C++ or are relocatable.

MikeMcQuaid commented 2 years ago

Homebrew also provides a consistent, uniform experience across the severely fragmented HPC space, which other tools like Spack, Anaconda, Lmod, etc. cannot do.

@danielnachun for my own curiosity: why can't they?

As for splitting up GCC into a separate compiler and library, this could work but I suspect that most of the cases where the user has an old GCC are also ones where they do not have admin access and are thus in non-default prefixes. Until we have full binary relocation, any non-relocatable packages which need a newer libstdc++ will also need the newer compiler to build them from source. So I'm not sure we'd save users much space unless they only happen to install packages which don't use C++ or are relocatable.

Good to know 👍🏻.

Given everything above I agree that bottling on CentOS 7 seems to be a non-starter for now. I do like the idea of us being able to use a fairly new Ubuntu version once we have relocation working better; pushing more users onto a non-system GCC seems like not much of an issue once we have relocation working for non-default prefixes.

sjackman commented 2 years ago

I've opened this PR to build binutils using the new formula glibc@2.13, which would allow us to drop the Debian 7 (Wheezy) Docker image homebrew/debian7.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.