Homebrew / homebrew-core

🍻 Default formulae for the missing package manager for macOS (or Linux)
https://brew.sh
BSD 2-Clause "Simplified" License

Remove Linux-only GCC dependencies #110010

Closed danielnachun closed 1 year ago

danielnachun commented 2 years ago

Now that we have migrated to a newer GCC in CI, we should no longer need the Linux-only dependencies on GCC that were added to support newer C++ standards.

We should discuss the best strategy for doing this that avoids breaking existing or new installations. Simply removing the depends_on "gcc" line shouldn't cause any problems for existing installations because we always add the RPATH for brewed GCC when pouring bottles. However, if the bottle has been built with GCC 12, it will be broken on new installations because we only require GCC 11, either from the host or from the gcc@11 formula. So anything built with GCC 12 does require a new bottle regardless of how we end up making the change.
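The rebottling rule above comes down to comparing major versions. A minimal Ruby sketch, assuming a hypothetical helper named needs_new_bottle? (not a Homebrew API) and assuming a binary only needs runtime libraries at least as new as the compiler that built it:

```ruby
# Hypothetical helper, not part of Homebrew: decides whether dropping the
# Linux-only `depends_on "gcc"` line requires rebuilding the bottle.
#
# built_with:         major version of the GCC that built the existing bottle
# guaranteed_runtime: oldest GCC runtime major a fresh installation is
#                     guaranteed to have (from the host or a versioned formula)
def needs_new_bottle?(built_with:, guaranteed_runtime:)
  # Assumption: a runtime at least as new as the build compiler is sufficient.
  built_with > guaranteed_runtime
end

# A bottle built with brewed GCC 12 while only GCC 11 is guaranteed.
puts needs_new_bottle?(built_with: 12, guaranteed_runtime: 11)
# A bottle built with the same major version that is guaranteed.
puts needs_new_bottle?(built_with: 11, guaranteed_runtime: 11)
```

Under this assumption, only bottles built against a newer GCC than the guaranteed one need to be rebuilt.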

Existing PR

carlocab commented 2 years ago

I suggest leaving formulae that need a CI-long-timeout label last, to maximise the chances that a version/revision bump is made in the intervening period that will remove the gcc dependency anyway. CI is too slammed to have more than 5 of these PRs going at a time.

danielnachun commented 2 years ago

Yeah I was definitely planning on leaving CI-long-timeout for the absolute last. I've tried to check previous runs for formulae to see if they needed CI-long-timeout and skip any that did require it. If I miss one that does end up needing the long build label I will probably close it for now and reopen when the rest of these are done.

I was hoping to submit the regular PRs that don't need CI-long-timeout in batches. What's a reasonable number to submit at once? I don't want to create a backlog in the queue when it's busy, although it seems to get very quiet during certain hours.

I'm also trying to avoid having too many open altogether even if they are finished because I don't want to accumulate too many failures. At the moment I will push a few more but I want to get the backlog merged before opening others.

danielnachun commented 2 years ago

In case anyone is wondering why I've been deleting links to PRs after they've been merged, the comment with the checklist starts to have trouble updating if there are too many links so I have to periodically delete them.

carlocab commented 1 year ago

Given that Ubuntu 22.04 has been using GCC 12 runtime libraries all this time: maybe we don't even need to publish new bottles to remove the Linux-only gcc dependency after all.

In particular, bottles built with our GCC 12 should work without brew install gcc on Ubuntu 22.04, and we'll be installing gcc automatically on systems where the host runtime libraries are too old.

MikeMcQuaid commented 1 year ago

In particular, bottles built with our GCC 12 should work without brew install gcc on Ubuntu 22.04, and we'll be installing gcc automatically on systems where the host runtime libraries are too old.

Will we install GCC 13 when we upgrade to it? If so/not, should we?

carlocab commented 1 year ago

Will we install GCC 13 when we upgrade to it? If so/not, should we?

I think we should for systems with GCC runtime libraries that are older than Ubuntu 22.04. Pinning our preferred GCC to GCC 12 obliges us to merge a gcc@12 formula simultaneously with an upgrade to GCC 13.

That's doable, but GCC upgrades are hard enough; let's not make them harder.

Upgrading gcc to a new major version alongside creating a new versioned formula for what the gcc formula used to be also tends to mask issues with dependents for the new gcc version.

MikeMcQuaid commented 1 year ago

I think we should for systems with GCC runtime libraries that are older than Ubuntu 22.04.

I think we should maintain the same versions whether they are from Homebrew or the system rather than having older systems have newer compilers.

carlocab commented 1 year ago

Note that it's important to distinguish between the compiler and the runtime libraries, because they need not be the same version. They are not on Ubuntu 22.04, which is why we needed Homebrew/brew#13882.
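To illustrate why the distinction matters, here is a small Ruby sketch. The Host struct and both helper methods are hypothetical names for illustration, not Homebrew APIs; the Ubuntu 22.04 versions (GCC 11 compiler, GCC 12 runtime libraries) are the ones reported in this thread.

```ruby
# Illustrative model only; the field names are assumptions.
Host = Struct.new(:name, :compiler_major, :runtime_major, keyword_init: true)

# Ubuntu 22.04 as described in this thread: /usr/bin/gcc is GCC 11, while
# the runtime libraries come from GCC 12.
UBUNTU_2204 = Host.new(name: "Ubuntu 22.04", compiler_major: 11, runtime_major: 12)

# Judging the host by its compiler version.
def too_old_by_compiler?(host, bottle_major)
  host.compiler_major < bottle_major
end

# Judging the host by its runtime libraries, which is what bottles link against.
def too_old_by_runtime?(host, bottle_major)
  host.runtime_major < bottle_major
end

bottle_major = 12 # bottles built with GCC 12 in this scenario
puts "compiler check says too old: #{too_old_by_compiler?(UBUNTU_2204, bottle_major)}"
puts "runtime check says too old:  #{too_old_by_runtime?(UBUNTU_2204, bottle_major)}"
```

The two checks disagree for this host, which is the situation that a check based on the compiler version alone would get wrong.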

I think we should maintain the same versions whether they are from Homebrew or the system rather than having older systems have newer compilers.

Ok, but what about systems whose runtime libraries are newer? Should we use the same versions as well? This means installing runtime libraries that are older than what is provided by the system, and that's something we haven't been doing historically.

MikeMcQuaid commented 1 year ago

Note that it's important to distinguish between the compiler and the runtime libraries, because they need not be the same version. They are not on Ubuntu 22.04, which is why we needed Homebrew/brew#13882.

@carlocab Yeh, we need to distinguish between these but I think it'd be much simpler if we didn't and picked the same GCC version for both.

Ok, but what about systems whose runtime libraries are newer? Should we use the same versions as well?

Depends how much we trust the backwards compatibility. If the answer is "a lot/perfectly" we don't need to do this I guess.

carlocab commented 1 year ago

@carlocab Yeh, we need to distinguish between these but I think it'd be much simpler if we didn't and picked the same GCC version for both.

Yes, and this, to me, is one of the reasons why we should just use the gcc formula without pinning the version. It's a lot simpler.

Depends how much we trust the backwards compatibility. If the answer is "a lot/perfectly" we don't need to do this I guess.

Ok, suppose we trust it a lot/perfectly. Then, I don't see why we shouldn't also use the runtime libraries provided by the unversioned gcc whenever the host libraries are older than the ones our bottles are built against.

If the concern is that gcc will one day become gcc@13, then this concern is the same as the one for systems where the runtime libraries are newer, and we've decided that we trust backward compatibility enough to not consider this an issue.

Conversely, if we don't trust backward compatibility enough to be okay with gcc being upgraded to gcc@13, then we also shouldn't allow our bottles to be run against host runtime libraries that are newer than ones we have in CI.

That said, Linuxbrew has worked for a long time on systems whose runtime libraries are newer than the ones our bottles are built against with very few issues, so I suspect we shouldn't be worried about using libraries that are too new, at least as far as GCC is concerned.

MikeMcQuaid commented 1 year ago

@carlocab Yeh, we need to distinguish between these but I think it'd be much simpler if we didn't and picked the same GCC version for both.

Yes, and this, to me, is one of the reasons why we should just use the gcc formula without pinning the version. It's a lot simpler.

It's simpler but it's less robust or consistent.

Ok, suppose we trust it a lot/perfectly. Then, I don't see why we shouldn't also use the runtime libraries provided by the unversioned gcc whenever the host libraries are older than the ones our bottles are built against.

Yes, runtime libraries would seem reasonable if we trust it perfectly.

I may trust the runtime libraries perfectly if @fxcoudert agrees.

I definitely do not trust a GCC 13.0.0 release to happily compile everything in Homebrew or expect upstream projects to jump to fix reported issues there. Backwards compatibility seems weaker in the compiler than in the runtime libraries, for sure.

That said, Linuxbrew has worked for a long time on systems whose runtime libraries are newer than the ones our bottles are built against with very few issues, so I suspect we shouldn't be worried about using libraries that are too new, at least as far as GCC is concerned.

If we had a separate formula for runtime libraries and the compiler: I'd agree. We don't, though, and it seems better to not have to install two different formulae for runtime and build time.

carlocab commented 1 year ago

I definitely do not trust a GCC 13.0.0 release to happily compile everything in Homebrew or expect upstream projects to jump to fix reported issues there.

Yep, neither do I. However, we're already carrying around two gcc@* versions in brew. I'm suggesting we reduce that to only 1 by having a preferred compiler version for building stuff, and using the unversioned one to provide the runtime.

This way, users who only use bottles only ever need to install one gcc version. This was previously not the case on Ubuntu 20.04, where you had gcc@11 installed as a global dependency, but also perhaps gcc because a formula depends on it (e.g. openblas).

fxcoudert commented 1 year ago

Yes, the runtime libraries are always backward-compatible (a bug is always possible, but I do not believe I have seen one in many years now, and bugs can be fixed).

The compiler is generally stable over time, but its set of default warnings (turned into errors when projects use -Werror) and default standard can evolve over time, which is usually what projects get hit by.

MikeMcQuaid commented 1 year ago

Yep, neither do I. However, we're already carrying around two gcc@* versions in brew. I'm suggesting we reduce that to only 1 by having a preferred compiler version for building stuff, and using the unversioned one to provide the runtime.

I disagree. People seem to use the gcc@ formulae: https://formulae.brew.sh/analytics-linux/install/90d/

The compiler is generally stable over time, but its set of default warnings (turned into errors when projects use -Werror) and default standard can evolve over time, which is usually what projects get hit by.

Yeh, this is what I'm concerned about.

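Given this I'm happy with:

  • always use the same or newer major GCC for runtime
  • always use the same major version of GCC for compilation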

carlocab commented 1 year ago

I disagree. People seem to use the gcc@ formulae: https://formulae.brew.sh/analytics-linux/install/90d/

These analytics are misleading. What I think is telling is (a) installs-on-request vs installs and (b) the ratio of installs in the past 30 days vs the past 365 days.

For gcc@11, in the past 30 days, we have ~500 installs-on-request vs ~25000 installs. Incidentally, ~25000 is also about the number of installs in the past 365 days.

What this tells us is that nearly all Linux users who installed gcc@11 did so recently, and did it as a dependency of another formula. This strongly suggests that they installed gcc@11 because we added it to their global dep tree.
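The two ratios can be worked out directly from the approximate figures quoted above. A quick Ruby sketch; the constant and method names are made up for illustration, and the inputs are the rounded numbers from this comment, not fresh analytics data:

```ruby
# Approximate gcc@11 figures on Linux, as quoted in this comment.
INSTALLS_ON_REQUEST_30D = 500
INSTALLS_30D            = 25_000
INSTALLS_365D           = 25_000

# Share of installs that users asked for explicitly.
def on_request_share(on_request, total)
  on_request.to_f / total
end

# Share of the past year's installs that happened in the past 30 days.
def recent_share(last_30d, last_365d)
  last_30d.to_f / last_365d
end

puts format("on-request share: %.1f%%",
            100 * on_request_share(INSTALLS_ON_REQUEST_30D, INSTALLS_30D))
puts format("recent share:     %.1f%%",
            100 * recent_share(INSTALLS_30D, INSTALLS_365D))
```

With these inputs, roughly 2% of installs were requested explicitly, and essentially all of the year's installs fall within the last 30 days.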

Given this I'm happy with:

  • always use the same or newer major GCC for runtime
  • always use the same major version of GCC for compilation
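The two rules above can be written as a pair of predicates. This is only a sketch: runtime_ok?, compiler_ok?, and ci_major are hypothetical names, and ci_major stands for the major version of the GCC that builds bottles in CI.

```ruby
# Rule 1: the runtime may be the same major GCC as CI uses, or newer.
def runtime_ok?(runtime_major, ci_major)
  runtime_major >= ci_major
end

# Rule 2: compilation must use exactly the same major GCC as CI.
def compiler_ok?(compiler_major, ci_major)
  compiler_major == ci_major
end

ci_major = 12 # example value taken from this thread
[11, 12, 13].each do |major|
  puts "GCC #{major}: runtime ok=#{runtime_ok?(major, ci_major)}, " \
       "compiler ok=#{compiler_ok?(major, ci_major)}"
end
```

The asymmetry reflects the discussion: the runtime libraries are trusted to be backward compatible, while a newer compiler may change default warnings or the default standard.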

Great; I think we are in agreement.

fxcoudert commented 1 year ago

The analytics for gcc@11 I think are misleading. I do not think it was installed as part of any dependency tree (because we do not have it as a dependency anywhere), but rather that it was the main gcc version for a long time. Is that possible? Otherwise, I cannot explain those numbers.

There should be no reason that gcc@N is ever used as a dependency. In fact, there is only one formula in all of homebrew-core that has a specific gcc version, and only on linux (envoy requires gcc@9, but there is an upstream patch for this available since June 2021: https://github.com/google/brotli/pull/893).

I am advocating keeping a number of older compilers around. I've argued in the past for it: compilers play a central role in the development process, and are key to user workflows. I think there is relatively little cost for us (and a lot of benefit for users) in keeping around a few versions of GCC. I think the same is true of other languages (python, ruby, go, openjdk) and databases (mariadb, postgresql). To me, those are really good exceptions to our general "reduce the number of versioned formulas" policy.

carlocab commented 1 year ago

I do not think it was installed as part of any dependency tree (because we do not have it as a dependency anywhere)

Oh, actually, for a few weeks gcc@11 was a dependency of everything in homebrew/core for users whose /usr/bin/gcc is older than GCC 11. (Now gcc@11 has been replaced with gcc@12. It turns out Ubuntu 22.04 uses the runtime libraries of GCC 12 but /usr/bin/gcc is GCC 11. See #110917.) The set of users this applies to (i.e. /usr/bin/gcc is older than GCC 11) is a significant fraction of our Linux user base.

MikeMcQuaid commented 1 year ago

There should be no reason that gcc@N is ever used as a dependency.

Except if we want to pin a certain version to avoid, e.g., the warning-related issues you mentioned above.

I am advocating keeping a number of older compilers around. I've argued in the past for it: compilers play a central role in the development process, and are key to user workflows. I think there is relatively little cost for us (and a lot of benefit for users) in keeping around a few versions of GCC. I think the same is true of other languages (python, ruby, go, openjdk) and databases (mariadb, postgresql). To me, those are really good exceptions to our general "reduce the number of versioned formulas" policy.

Strongly agree.

danielnachun commented 1 year ago

This is done. Thank you to everyone who helped with this huge undertaking!