Homebrew / brew

🍺 The missing package manager for macOS (or Linux)
https://brew.sh
BSD 2-Clause "Simplified" License

Multiple cellars in a single Homebrew installation (i.e. Homebrew "virtual environments") #11787

Closed carlocab closed 2 years ago

carlocab commented 3 years ago

Provide a detailed description of the proposed feature

Currently, Homebrew installs everything into $HOMEBREW_PREFIX/Cellar. I believe it would be useful to be able to use brew to:

  1. Create additional cellars; and,
  2. Switch between these cellars when using/installing formulae.

The aim is to make this suggestion of mine a bit easier to do with a single Homebrew installation, while possibly also simultaneously avoiding having to build anything from source.

What is the motivation for the feature?

A common pain point that users often express is that brew [install,upgrade] <formula> upgrades many other seemingly unrelated formulae. [1, 2, 3, 4, etc]

Using multiple cellars could function a lot like virtual environments for Homebrew that allow you to isolate different formulae installations from each other. In particular, it would allow you to guarantee that installing something will not touch anything else you have installed.

[1] https://twitter.com/brendt_gd/status/1409404353623085058?s=21 [2] #11778 [3] https://stackoverflow.com/questions/4523920/how-do-i-update-a-formula-with-homebrew/7898617#comment111424255_7898617 [4] https://twitter.com/richardf/status/1418900413760380934?s=21

How will the feature be relevant to at least 90% of Homebrew users?

It may not be, but it might help us address a complaint that comes up pretty often.

What alternatives to the feature have been considered?

Doing nothing, or providing better documentation for maintaining multiple Homebrew installations as a way of rolling your own virtual environments.

MikeMcQuaid commented 3 years ago

This is a nice idea. Unfortunately for non-relocatable bottles to work they need to have a matching prefix and cellar for some software. We could potentially have bottles with relocatable cellars/prefixes/repos independently but that work would need to be done and the existing bottle metadata detected/tracked before we could implement something like this 😭.

It's a shame because I also think this would be a great feature to have. I've thought in the past about having brew bundle handle this on a per-project basis, but it requires more/all cellar :any bottles to be useful.

carlocab commented 3 years ago

I was thinking about the non-relocatable bottles issue and considered it a potential blocker. #10846 could be useful here.

Here's one way around it: a user's active cellar is always named $HOMEBREW_PREFIX/Cellar. I believe this should be enough to guarantee that we can always pour bottles into any virtual env (assuming a default prefix). The drawback is that there's no telling what happens when you try to use something in an inactive cellar. (Compare this to a Python venv, which you don't even need to activate to use.) Another potential drawback is that we'd be leaning heavily on file system operations to switch virtual environments, and I can see how this might be a bit fragile.
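To make the "active cellar is always $HOMEBREW_PREFIX/Cellar" idea concrete, here is a rough, untested-against-real-brew sketch of the file system switch. The Cellars/ parking directory, the .env-name marker file, the switch_cellar function and the gettext 0.20 keg are all hypothetical names made up for illustration; none of this exists in Homebrew, and the demo only touches a throwaway directory.

```shell
set -e

# Hypothetical layout (not a Homebrew feature): inactive cellars are parked in
# $prefix/Cellars/<name>; the active one always lives at $prefix/Cellar and
# records its own name in $prefix/Cellar/.env-name.
switch_cellar() {
  local prefix=$1 target=$2 current
  current=$(cat "$prefix/Cellar/.env-name")
  [ "$current" = "$target" ] && return 0
  [ -d "$prefix/Cellars/$target" ] || { echo "no such cellar: $target" >&2; return 1; }
  mv "$prefix/Cellar" "$prefix/Cellars/$current"   # park the active cellar
  mv "$prefix/Cellars/$target" "$prefix/Cellar"    # activate the requested one
}

# Demo against a throwaway prefix, never a real Homebrew installation.
demo_prefix=$(mktemp -d)
mkdir -p "$demo_prefix/Cellar/gettext/0.21" "$demo_prefix/Cellars/work/gettext/0.20"
echo default > "$demo_prefix/Cellar/.env-name"
echo work > "$demo_prefix/Cellars/work/.env-name"

switch_cellar "$demo_prefix" work
active_env=$(cat "$demo_prefix/Cellar/.env-name")
echo "active cellar: $active_env"
```

In a real installation the symlinks under the prefix would presumably also have to be refreshed after each switch, which is part of why this could be fragile.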

MikeMcQuaid commented 3 years ago

I was thinking about the non-relocatable bottles issue and considered it a potential blocker. #10846 could be useful here.

Agreed although I'm personally pessimistic that this is going to get us to 100% relocatable coverage in the medium term.

Here's one way around it: a user's active cellar is always named $HOMEBREW_PREFIX/Cellar. I believe this should be enough to guarantee that we can always pour bottles into any virtual env (assuming a default prefix). The drawback is that there's no telling what happens when you try to use something in an inactive cellar. (Compare this to a Python venv, which you don't even need to activate to use.) Another potential drawback is that we'd be leaning heavily on file system operations to switch virtual environments, and I can see how this might be a bit fragile.

This seems more workable but, yes, potentially "dangerous".

Another related solution I've thought of is advertising HOMEBREW_NO_AUTO_UPDATE, HOMEBREW_NO_INSTALL_CLEANUP, HOMEBREW_NO_INSTALL_UPGRADE, HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK, etc. a bit more prominently so people can use them (at their own risk) if that's how they prefer to use Homebrew.
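For reference, opting in could look roughly like the following in a shell profile. The variable names are the ones listed above; the short descriptions are a summary from memory rather than the official wording, and setting the value to 1 is just a convention.

```shell
# Opt out of Homebrew's automatic behaviours (at your own risk).
export HOMEBREW_NO_AUTO_UPDATE=1                 # skip the automatic `brew update`
export HOMEBREW_NO_INSTALL_CLEANUP=1             # skip the automatic cleanup after installs/upgrades
export HOMEBREW_NO_INSTALL_UPGRADE=1             # `brew install foo` leaves an outdated foo alone
export HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK=1  # do not check or upgrade installed dependents
```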

danielnachun commented 3 years ago

I was thinking about the non-relocatable bottles issue and considered it a potential blocker. #10846 could be useful here.

Agreed although I'm personally pessimistic that this is going to get us to 100% relocatable coverage in the medium term.

I've now got a few examples of corner cases that will require some creativity to deal with, so I agree that 100% will take a while to get to.

Here's one way around it: a user's active cellar is always named $HOMEBREW_PREFIX/Cellar. I believe this should be enough to guarantee that we can always pour bottles into any virtual env (assuming a default prefix). The drawback is that there's no telling what happens when you try to use something in an inactive cellar. (Compare this to a Python venv, which you don't even need to activate to use.) Another potential drawback is that we'd be leaning heavily on file system operations to switch virtual environments, and I can see how this might be a bit fragile.

This is the model Anaconda uses, and it works quite well from what I've seen. They have a pkgs folder equivalent to the cellar where all packages are installed (and it is this prefix that is replaced by binary patching for non-relocatable binaries). The environments are all stored as subfolders in an envs folder, with the appropriate files for each environment being hard-linked into the corresponding folder. Where they fall short, in my opinion, is that their dependency resolution becomes intractable with too many packages because dependencies are always versioned.
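A toy illustration of that pkgs/envs layout, with made-up package and environment names (foo-1.0, env_a, env_b) in a temporary directory. It only demonstrates the hard-linking part, not the binary patching or the dependency resolution.

```shell
set -e

root=$(mktemp -d)
mkdir -p "$root/pkgs/foo-1.0/bin" "$root/envs/env_a/bin" "$root/envs/env_b/bin"
echo 'payload' > "$root/pkgs/foo-1.0/bin/foo"

# Each environment gets a hard link to the single copy stored under pkgs/.
for env in env_a env_b; do
  ln "$root/pkgs/foo-1.0/bin/foo" "$root/envs/$env/bin/foo"
done

# -ef is true when both paths refer to the same device and inode,
# i.e. the file is stored once no matter how many environments use it.
if [ "$root/pkgs/foo-1.0/bin/foo" -ef "$root/envs/env_b/bin/foo" ]; then
  shared=yes
else
  shared=no
fi
echo "file shared between pkgs and env_b: $shared"
```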

Assuming we continue to use the same cellar as before, there are at least two issues I can foresee that are fixable (albeit with a lot of work):

1) Currently we only add formula-specific RPATHs for keg-only dependencies, and then add the single RPATH $HOMEBREW_PREFIX/lib to locate libraries for all dependencies which are not keg-only. This would lead to potential breakage if $HOMEBREW_PREFIX/lib contains an incompatible shared library with the same name as the older version needed by the virtual environment.

Anaconda always uses package-specific RPATHs to avoid this, and we've discussed before wanting to do the same here because it would also, for example, prevent users from breaking formulae by using brew unlink. The main problem here is that it essentially requires rebottling everything, at least on Linux. It's probably less of an issue on macOS where RPATHs are less important.

2) We would probably need to make more aggressive use of shebang rewrites and env scripts for interpreted/byte compiled languages. Specifying the exact path to the interpreter and library locations even when installed in the default prefix would make it trivial to replace that path with one to a virtual env instead.

These are just some preliminary thoughts that have been bouncing around in my head for a while. I've had a lot of people tell me that they just wish they could use Homebrew for managing all their software, with a "base" environment with the newest version of things, and virtual environments when they need exact versions, instead of having to use Anaconda or containers for this. There's plenty to criticize about how Homebrew is designed, but the fact that it doesn't require sudo access, namespaces/Proot, or virtualization to provide binary packages (with some exceptions when in a non-default prefix, for now) makes it a lot more flexible and easy to use.

jafd commented 3 years ago

This is a nice idea. Unfortunately for non-relocatable bottles to work they need to have a matching prefix and cellar for some software. We could potentially have bottles with relocatable cellars/prefixes/repos independently but that work would need to be done and the existing bottle metadata detected/tracked before we could implement something like this 😭.

As someone who has to solve this problem right now (and I solved it by putting several spokes into Homebrew itself, perfect is the enemy of good, etc):

My particular use case is "make a distribution of an internal-use software collection that uses Homebrew to manage itself and doesn't interfere with a Homebrew installation the user may already have".

So far I've been under the impression that the Homebrew's direction is going firmly towards a hardcoded One True Way, and overcoming any of the problems I've mentioned is going to be an increasingly uphill battle. Please correct me if I'm wrong.

carlocab commented 3 years ago

I'd risk to estimate that more than half of formulae would transitively pull one of those either to build themselves or at runtime.

You're very much correct:

❯ cat <(brew uses --include-build --recursive pkg-config) <(brew uses --include-build --recursive openssl@1.1) | sort | uniq | wc -l
3046
  • I was unable, when building bottles in a custom prefix, to make the custom prefix propagate inside the bottle. When trying to install the same bottle, Homebrew told me it was made for /usr/local. Maybe there's a bug. Maybe it's a feature, I have no way of knowing, as this part is not sufficiently documented.

This doesn't sound right. Though we only ever build in default prefixes, so I guess it doesn't surprise me that some of that is baked into brew somewhere. Maybe the build prefix isn't being recorded in the tab?


Here's a quick-and-dirty way to implement this (that I have not tested, caveat emptor): put your Cellar under version control. Then, your different branches are your virtual environments.

You'll need to do a brew cleanup and brew link whenever you switch branches, but I don't think scripting this should be too hard.

You'll probably also want to keep your Git history short, and do git gc early, and often.

jafd commented 3 years ago

Maybe the build prefix isn't being recorded in the tab?

You mean the JSON file that a bottle is made from in the brew bottle command? It pretty much is. I even tried putting it into the JSON manually in between brew build-bottle and brew bottle, but even then it was nowhere to be found in the resulting tarball.

The bottle tarballs don't even mention the prefix in any of the extra files, so it looks as if the whole machinery is omitted. Upon BottleSpecification initialization, the only ever value assigned to the prefix is the default prefix. Or maybe I've been looking in the wrong places.

(My own hunch is that prefix, in fact, should be the parent directory of whatever directory the brew command is in, and shouldn't really be significant for any operation; the cellar path is more important anyway, and since one is usually derived from the other, I'm wondering why there are requirements for specific values of both of them.)

jafd commented 3 years ago

Here's a quick-and-dirty way to implement this (that I have not tested, caveat emptor): put your Cellar under version control. Then, your different branches are your virtual environments.

But then you cannot be using both at the same time. Python's virtual environments are that good precisely because you can use many of them at the same time.

carlocab commented 3 years ago

But then you cannot be using both at the same time.

I've already acknowledged difficulties with using both at the same time in my previous comments. There's almost certainly a hard trade-off between using bottles and being able to use multiple isolated virtual environments at the same time. If you want one, then you give up the other.

jafd commented 3 years ago

There is one caveat: I'd like it to work if I don't mind building the bottles I want (and then installing the result on 10 machines, as opposed to building the same things 10 times). Let's start small.

Right now, I have to vendor in most of the formulae for my stuff and correct the dependencies so that everything depending on pkg-config really depends on my/stuff/pkg-config, and likewise for everything that depended on that, recursively. That totals 47 formulae, of which I really cared about maybe 10–12, and of which 5 are in-house ones. This can easily take upwards of a day the first time, and then a couple of hours each time I update those formulae. I think it can be done better.

True, the resulting bottles won't be portable within the filesystem tree. But if they carry the correct architecture, prefix and cellar in their specification, it's not going to be a problem.

danielnachun commented 3 years ago

These points are why I think using multiple cellars is not the best approach. There is nothing preventing multiple versions of the same formula from existing in the same cellar, because the exact version and revision is in the prefix. For a given formula, the real prefix is $HOMEBREW_PREFIX/Cellar/${full_version}, and there is no reason you cannot also have $HOMEBREW_PREFIX/Cellar/${older_version} installed alongside. What matters is what ends up in PATH.

Currently we link any formula which is not keg-only into $HOMEBREW_PREFIX. You cannot symlink two different versions of the same formula into $HOMEBREW_PREFIX if they have conflicting file names, but you could link your older version into $HOMEBREW_PREFIX/envs/my_env1, and this will have no effect on the user's existing installation, unless they add $HOMEBREW_PREFIX/envs/my_env1/bin to their PATH.
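Roughly what that could look like, sketched in a temporary directory rather than a real prefix. The envs/my_env1 directory follows the naming above and does not exist in Homebrew today; the foo 1.0 keg and its script are stand-ins.

```shell
set -e

# Fake prefix in a temp dir; nothing here touches a real installation.
prefix=$(mktemp -d)
keg="$prefix/Cellar/foo/1.0"        # hypothetical older keg
mkdir -p "$keg/bin" "$prefix/envs/my_env1/bin"
printf '#!/bin/sh\necho "foo 1.0"\n' > "$keg/bin/foo"
chmod +x "$keg/bin/foo"

# Link every executable of the keg into the environment's bin directory.
for exe in "$keg"/bin/*; do
  ln -s "$exe" "$prefix/envs/my_env1/bin/$(basename "$exe")"
done

# "Activating" the environment is just a PATH change in the current shell.
PATH="$prefix/envs/my_env1/bin:$PATH"
foo_output=$(foo)
echo "$foo_output"
```

Anything that never adds envs/my_env1/bin to its PATH keeps seeing only what is linked into the regular prefix.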

In your particular case @jafd, you may not even need a dedicated environment if all you need is some versioned dependencies for your in-house formulae. You may just be able to use brew extract to extract the versions of our formulae that you need for your in-house formulae, and then make them keg-only (this will automatically add the right PKG_CONFIG_PATH for you). Then you could make your in-house formulae depend on those instead of the newest version of the formulae, and link it into the regular $HOMEBREW_PREFIX.

The only limitations to this are that you would need to prevent Homebrew from automatically deleting the cellar installations of the older versions of the formulae (there may be an environment variable that already does this), and if your in-house formulae use RPATH, you'd probably have to change https://github.com/Homebrew/brew/blob/02756cfe455c2ac80db1c646493eccd4d7c3b967/Library/Homebrew/extend/ENV/super.rb#L218 to paths << deps.map(&:opt_lib) so that they don't try to use shared libraries from the newer, incompatible versions of dependencies.

The "one cellar, many environments/prefixes" model already works quite well with Anaconda, and it requires surprisingly few changes I think to work with Homebrew.

carlocab commented 3 years ago

brew used to have better support for this -- back when we had things like brew switch. It never worked properly though and led to things being broken a lot. I think this is largely because, as you point out, dependency resolution is hard. This means that doing this probably requires a lot more changes to brew to get it to work, which I don't think are quite feasible (or else we'd have just fixed brew switch rather than removed it).

danielnachun commented 3 years ago

We've discussed getting rid of $HOMEBREW_PREFIX/lib from RPATH before because that would also mean that brew unlink wouldn't break formula, which is actually a pretty serious issue on Linux sometimes. So I think there's a justification to do that beyond just making virtual environments, and it's something I want to start testing soon.

I think brew switch was fundamentally flawed because it was trying to swap out dependency versions within $HOMEBREW_PREFIX which in combination with putting $HOMEBREW_PREFIX/lib in RPATH and insufficient use of shebang replacement, was inevitably going to lead to problems. It also didn't help that the response from a lot of people who didn't want it to go away was very hostile and negative, rather than trying to understand the root of the problem and provide solutions.

In essence what I'm proposing would mean that formulae should be completely unaware of the existence of $HOMEBREW_PREFIX, and only find libraries and executables in their cellar paths. This doesn't solve the problem of conflicting file names within a given environment, but that's where the symlinked prefixes come in handy.

I'm not going to pretend that is an easy task to achieve overall, but I think the fix for $HOMEBREW_PREFIX/lib in RPATH is simple, and shebangs which have not been rewritten could be caught with a relatively simple audit in CI. The fortunate thing here is that, since Homebrew was developed, we now have other tools like Anaconda and Spack who have already solved most of these issues that we can look to for ideas. Studying how they work is the only reason I even have these ideas at all!

jafd commented 3 years ago

In your particular case @jafd, you may not even need a dedicated environment if all you need is some versioned dependencies for your in-house formulae.

I need a self-contained Homebrew install that doesn't interfere with whatever the user might have in /usr/local. This seems to be pretty different. This separate install should work whether the user already has a Homebrew installation or not. This separate installation could also be, for all intents and purposes, packaged into an .mpkg installer package (done so before, too — with a lot of sweating'n'swearing, but it worked) and be truly self-contained.

I was just thinking about the bottle conundrum, and the hard problems would appear to be 1) how to tell offhand that bottle X was built for prefix Y? 2) not only the bottle checksums change if you rebuild them for some other place, their base URL changes too. So if I wanted to keep the formulae from homebrew-core intact, I'd need to decouple bottle descriptions somehow, too.

It appears that my particular use case is a bit different from other things you're trying to resolve. Maybe it could merit an issue of its own?

(Note that I'm all for prefix-agnostic and 100% relocatable installs, don't get me wrong, but the reality is that you can't patch them all.)

carlocab commented 3 years ago

You're right that this probably belongs in a separate issue. However: how are you building your bottles exactly?

I tried building a non-relocatable bottle with brew installed into a non-default prefix, and this is what I got:

❯ devbrew bottle --json --only-json-tab gettext
==> Determining gettext bottle rebuild...
==> Bottling gettext--0.21.big_sur.bottle.1.tar.gz...
==> Detecting if gettext--0.21.big_sur.bottle.1.tar.gz is relocatable...
./gettext--0.21.big_sur.bottle.1.tar.gz
  bottle do
    rebuild 1
    sha256 cellar: "/Users/carlocab/homebrew/Cellar", big_sur: "4129f9323bbae510467f54f2e3114e05a6fa027cd2b8f15af4491ea1fe44fe1d"
  end

devbrew is my alias to the non-/usr/local installation of brew.

The required cellar is also recorded in the tab:

❯ jq '.[].bottle.cellar' <gettext--0.21.big_sur.bottle.json
"/Users/carlocab/homebrew/Cellar"

So it seems to me that brew does correctly record the build prefix.

In any case, as noted above, this is a separate discussion, which we can take to the discussions page: https://github.com/Homebrew/discussions/discussions

jafd commented 3 years ago

Here it is: https://github.com/Homebrew/discussions/discussions/2031

MikeMcQuaid commented 3 years ago

In essence what I'm proposing would mean that formulae should be completely unaware of the existence of $HOMEBREW_PREFIX, and only find libraries and executables in their cellar paths. This doesn't solve the problem of conflicting file names within a given environment, but that's where the symlinked prefixes come in handy.

@danielnachun This isn't possible for all formulae. Anything with post-install that installs into the prefix will break in this scenario e.g. python, node to name two incredibly widely used formulae.

I need a self-contained Homebrew install that doesn't interfere with whatever the user might have in /usr/local. This seems to be pretty different. This separate install should work whether the user already has a Homebrew installation or not. This separate installation could also be, for all intents and purposes, packaged into an .mpkg installer package (done so before, too — with a lot of sweating'n'swearing, but it worked) and be truly self-contained.

Self-contained: easy. Self-contained with prebuilt binaries: not terribly easy but not impossible if you mandate the installation path on disk. Self-contained with relocatable prebuilt binaries: very hard.

  • how to tell offhand that bottle X was built for prefix Y?

The prefix is a product of the cellar. The cellar is part of the bottle do block (or implicit to the default cellar for that installation location).
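As a rough sketch of "the prefix is a product of the cellar": the cellar string below is the one from the bottle do output earlier in this thread, and the sketch assumes the cellar is simply the Cellar directory directly under the prefix. The cellar_matches helper is hypothetical and much cruder than whatever brew does internally (it ignores relocatable bottles, for one).

```shell
set -e

# Cellar path as recorded in a bottle block.
bottle_cellar="/Users/carlocab/homebrew/Cellar"

# Assumption: the cellar is the "Cellar" directory directly under the prefix.
bottle_prefix=$(dirname "$bottle_cellar")

# Hypothetical check: was the bottle built for this installation's cellar?
cellar_matches() {
  [ "$1" = "$2" ]
}

if cellar_matches "$bottle_cellar" "/usr/local/Cellar"; then
  verdict="pourable here"
else
  verdict="built for $bottle_prefix"
fi
echo "$verdict"
```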

carlocab commented 3 years ago

In essence what I'm proposing would mean that formulae should be completely unaware of the existence of $HOMEBREW_PREFIX, and only find libraries and executables in their cellar paths. This doesn't solve the problem of conflicting file names within a given environment, but that's where the symlinked prefixes come in handy.

I'm not going to pretend that is an easy task to achieve overall, but I think the fix for $HOMEBREW_PREFIX/lib in RPATH is simple, and shebangs which have not been rewritten could be caught with a relatively simple audit in CI. The fortunate thing here is that, since Homebrew was developed, we now have other tools like Anaconda and Spack who have already solved most of these issues that we can look to for ideas. Studying how they work is the only reason I even have these ideas at all!

This is not only not easy: I think this is very difficult, at least on macOS.

Currently, we point all formulae [*] that depend on formula foo's libraries to look into $HOMEBREW_PREFIX/opt/foo/ because of the install name rewriting we do in Keg#fix_dynamic_linkage (https://github.com/Homebrew/brew/blob/c9f4d1d90076c08f8c20dcc9772796e49321ab22/Library/Homebrew/extend/os/mac/keg_relocate.rb#L44). We deliberately do not point these formulae to cellar paths, because we don't want library paths to become needlessly invalidated by version/revision bumps that do not change the library version or API/ABI.

Making formulae look in cellars instead of opt paths probably requires one of:

  1. accepting that we just have to rebuild bottles even if all that's changed is a cellar path (i.e. the newly-built libraries are ABI/API compatible); or,
  2. setting up some way to rewrite install names of dependent formulae with brew upgrade, and actually track when this needs to be done.

This might also require rebuilding all the bottles that we currently have.

Alternatively, we could try to exploit RPATH a bit more on macOS to make the logic a bit more similar to Linux, where things are (presumably) simpler. This probably entails installing all libraries as some variant of @rpath/libname.dylib instead of /usr/local/opt/name/lib/libname.dylib. A couple of barriers to this:

  1. This will break many build scripts that use Homebrew-provided libraries, as they will additionally need to specify -Wl,-rpath,/path/to/libs rather than just -L/path/to/libs in their LDFLAGS.
  2. We'll need to fix up brew linkage on macOS, as it currently doesn't do a linkage check for variable-prefixed libraries (i.e. those that start with @rpath, @loader_path, @executable_path)
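To spell out barrier 1: with absolute install names, a link-time search path is enough, because the library's full install name gets copied into the binary that links against it. With @rpath-style install names, the binary also needs its own runtime search path. The paths below are the same placeholders as above, not real locations.

```shell
libdir="/path/to/libs"   # placeholder, as in the list above

# Absolute install names: a link-time search path is sufficient.
ldflags_today="-L$libdir"

# @rpath install names: the binary additionally needs a runtime search path.
ldflags_rpath="-L$libdir -Wl,-rpath,$libdir"

echo "LDFLAGS (absolute install names): $ldflags_today"
echo "LDFLAGS (@rpath install names):   $ldflags_rpath"
```

Every third-party build script that currently passes only -L for a Homebrew library would have to grow the second form.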

It's also not clear to me that cellar-path invalidation isn't a problem if we rely on just RPATHs instead of complete install names (though I will admit having to fix up only directory references is probably a bit easier than having to fix up complete paths to libraries).

Point 1 above is going to lead to lots of unhappy users. (I had previously contemplated trying to rely on RPATHs more on macOS after my recent work on RPATH relocation and concluded it had too many downsides relative to the potential benefits.)

[*] All formulae with the possible exception of foo, that is, which we try to make sure still finds its own libraries in the cellar.

danielnachun commented 3 years ago

In essence what I'm proposing would mean that formulae should be completely unaware of the existence of $HOMEBREW_PREFIX, and only find libraries and executables in their cellar paths. This doesn't solve the problem of conflicting file names within a given environment, but that's where the symlinked prefixes come in handy.

@danielnachun This isn't possible for all formulae. Anything with post-install that installs into the prefix will break in this scenario e.g. python, node to name two incredibly widely used formulae.

For those 2 packages at least, it would make sense to rerun the post-install when they are included in a new environment. I could imagine adding some logic to the hypothetical environment creation command (something like brew env new etc.) which would rerun postinstall blocks with the new HOMEBREW_PREFIX. From what I understand the part of Python, for example, that is in the cellar doesn't actually know about what we installed in HOMEBREW_PREFIX, so if that prefix changes it doesn't affect what's in the cellar (this would need a lot of testing).

danielnachun commented 3 years ago

This is not only not easy: I think this is very difficult, at least on macOS.

Currently, we point all formulae [*] that depend on formula foo's libraries to look into $HOMEBREW_PREFIX/opt/foo/ because of the install name rewriting we do in Keg#fix_dynamic_linkage (https://github.com/Homebrew/brew/blob/c9f4d1d90076c08f8c20dcc9772796e49321ab22/Library/Homebrew/extend/os/mac/keg_relocate.rb#L44). We deliberately do not point these formulae to cellar paths, because we don't want library paths to become needlessly invalidated by version/revision bumps that do not change the library version or API/ABI.

Making formulae look in cellars instead of opt paths probably requires one of:

  1. accepting that we just have to rebuild bottles even if all that's changed is a cellar path (i.e. the newly-built libraries are ABI/API compatible); or,
  2. setting up some way to rewrite install names of dependent formulae with brew upgrade, and actually track when this needs to be done.

I really like the idea of using brew upgrade to rewrite install paths and RPATH of installed dependent formulae. That would save us the trouble of rebuilding things when the libraries are still API/ABI compatible, while still allowing us to use cellar paths. We know which formulae a user has installed that depend on a given formula when it is being upgraded, and we already have the tools to do the rewrites.

This might also require rebuilding all the bottles that we currently have.

Unfortunately to get rid of HOMEBREW_PREFIX/lib from RPATH, it would require rebuilding most formulae, especially on Linux (less so on macOS since we try to remove RPATH when unneeded). I would propose incrementally implementing some of these changes to improve the independence of bottles from HOMEBREW_PREFIX. I would guess that there are a small number of very popular formulae which would be needed for a lot of these virtual environments where rebuilding would make the biggest difference, and then over time we'd be able to get to the rest.

Alternatively, we could try to exploit RPATH a bit more on macOS to make the logic a bit more similar to Linux, where things are (presumably) simpler. This probably entails installing all libraries as some variant of @rpath/libname.dylib instead of /usr/local/opt/name/lib/libname.dylib. A couple of barriers to this:

  1. This will break many build scripts that use Homebrew-provided libraries, as they will additionally need to specify -Wl,-rpath,/path/to/libs rather than just -L/path/to/libs in their LDFLAGS.
  2. We'll need to fix up brew linkage on macOS, as it currently doesn't do a linkage check for variable-prefixed libraries (i.e. those that start with @rpath, @loader_path, @executable_path)

It's also not clear to me that cellar-path invalidation isn't a problem if we rely on just RPATHs instead of complete install names (though I will admit having to fix up only directory references is probably a bit easier than having to fix up complete paths to libraries).

Point 1 above is going to lead to lots of unhappy users. (I had previously contemplated trying to rely on RPATHs more on macOS after my recent work on RPATH relocation and concluded it had too many downsides relative to the potential benefits.)

[*] All formulae with the possible exception of foo, that is, which we try to make sure still finds its own libraries in the cellar.

I think if we can replace RPATH or install names in installed bottles when an API/ABI compatible change is made to a dependency, we wouldn't need to change our use of RPATH either way on macOS.

danielnachun commented 3 years ago

It's occurred to me that trying to make kegs more isolated and independent from HOMEBREW_PREFIX, and supporting multiple symlinked prefixes (as opposed to multiple cellars) are related but separate goals. Perhaps it would make sense to open a separate issue for this? There are several concrete changes that I think I could propose there in the short term.

MikeMcQuaid commented 3 years ago

Making formulae look in cellars instead of opt paths

We shouldn't do this. We've spent a lot of time doing the opposite.

  • This will break many build scripts that use Homebrew-provided libraries, as they will additionally need to specify -Wl,-rpath,/path/to/libs rather than just -L/path/to/libs in their LDFLAGS.

We shouldn't do this.

For those 2 packages at least, it would make sense to rerun the post-install when they are included in a new environment. I could imagine adding some logic to the hypothetical environment creation command (something like brew env new etc.) which would rerun postinstall blocks with the new HOMEBREW_PREFIX. From what I understand the part of Python, for example, that is in the cellar doesn't actually know about what we installed in HOMEBREW_PREFIX, so if that prefix changes it doesn't affect what's in the cellar (this would need a lot of testing).

Redoing this every time is going to slow things down a lot, possibly to the point that the feature is unused.

while still allowing us to use cellar paths.

I don't think we should use cellar paths. They are an implementation detail we should hide wherever possible.

Unfortunately to get rid of HOMEBREW_PREFIX/lib from RPATH, it would require rebuilding most formulae, especially on Linux (less so on macOS since we try to remove RPATH when unneeded).

We've rebuilt almost everything in the last couple of months for Linux. We do so for every macOS release. I don't think that should rule things out (as long as we maintain backwards compatibility, of course).

It's occurred to me that trying to make kegs more isolated and independent from HOMEBREW_PREFIX, and supporting multiple symlinked prefixes (as opposed to multiple cellars) are related but separate goals. Perhaps it would make sense to open a separate issue for this? There are several concrete changes that I think I could propose there in the short term.

I'd like to see more articulation of how these changes help the majority of Homebrew users before we split into more issues. As-is, it looks like a lot of work and complex changes to support a tiny minority of users who want this behaviour.

carlocab commented 3 years ago

Making formulae look in cellars instead of opt paths

We shouldn't do this. We've spent a lot of time doing the opposite.

Agreed. This seems to be a key part of what @danielnachun wants to do, however. I don't see how it would be feasible to install multiple versions of the same formula in a single cellar without this.

  • This will break many build scripts that use Homebrew-provided libraries, as they will additionally need to specify -Wl,-rpath,/path/to/libs rather than just -L/path/to/libs in their LDFLAGS.

We shouldn't do this.

Also agreed.

danielnachun commented 3 years ago

I don't think we should use cellar paths. They are an implementation detail we should hide wherever possible.

I definitely agree that they should be hidden when possible. Specifically in the case of using opt_lib in RPATH/install names, my understanding is that we do this to avoid having to rebuild binaries every time a formula gets an update that does not break ABI compatibility. If we can achieve the same outcome by updating RPATH/install names of installed dependents when a formula is upgraded, I guess I don't see the downside of using cellar paths. The same could be said of updating shebangs, if we use opt_lib for those. However, I readily acknowledge that I was not involved in the project at the time those design decisions were made, and there may be other mitigating factors I'm not aware of.

Interestingly, when I grepped for opt_lib in homebrew/core, I was surprised to only see 452 formulae, and a sizable chunk of those seem to only use it in test blocks, which is totally fine. In addition to the previously mentioned updating of RPATH/install names of dependents, for the bottled formula itself the opt_lib paths could be replaced by cellar paths during bottling or bottle pouring, so that no formulae would need to be changed initially. And if we did eventually want to eliminate the use of opt_lib in install blocks, the number of affected formulae may not be that large.

We've rebuilt almost everything in the last couple of months for Linux. We do so for every macOS release. I don't think that should rule things out (as long as we maintain backwards compatibility, of course).

It's occurred to me that if we do develop a way to replace RPATH in installed bottles, then we wouldn't even need to rebottle anything to get rid of HOMEBREW_PREFIX/lib, as it could simply be replaced with the opt_lib (or cellar path eventually) for each dependency.
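
To make the idea concrete, here is a minimal sketch of the path rewrite only, not of the binary patching itself (that would need something like patchelf on Linux or install_name_tool on macOS). The function name, the argument shapes, and the example prefix and dependency names are all hypothetical and are not part of brew.

```python
def rewrite_rpaths(rpaths, prefix, deps):
    """Replace the shared `<prefix>/lib` RPATH entry with one opt path per dependency.

    Hypothetical helper: `rpaths` is the list of entries read from a binary,
    `prefix` stands in for HOMEBREW_PREFIX, `deps` are the formula's dependencies.
    """
    prefix = prefix.rstrip("/")
    shared_lib = f"{prefix}/lib"
    rewritten = []
    for entry in rpaths:
        if entry.rstrip("/") == shared_lib:
            # Swap the catch-all directory for the per-dependency opt paths.
            rewritten.extend(f"{prefix}/opt/{dep}/lib" for dep in deps)
        else:
            # Leave unrelated entries (e.g. $ORIGIN-relative ones) untouched.
            rewritten.append(entry)
    # Drop duplicates while preserving the original search order.
    return list(dict.fromkeys(rewritten))


new_rpaths = rewrite_rpaths(
    ["/home/linuxbrew/.linuxbrew/lib", "$ORIGIN/../lib"],
    "/home/linuxbrew/.linuxbrew",
    ["zlib", "openssl@1.1"],
)
```

Swapping the opt path for an exact cellar path later would only change the string that is generated per dependency, which is why the two steps could be rolled out separately.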

I'd like to see more articulation of how these changes help the majority of Homebrew users before we split into more issues. As-is, it looks like a lot of work and complex changes to support a tiny minority of users who want this behaviour.

This is critical as what is being proposed here is complex, and should have a good motivation behind it. I am generally not very sympathetic to the desire to maintain older versions of formulae around, particularly when it is because people are trying to use Homebrew as a library manager, and especially when it is because an upstream developer refuses to provide ABI/API compatibility between minor version updates.

Where I think it may be more useful is in research, where I and several other maintainers and a decent chunk of our user base work. We find ourselves needing to run specific versions of tools (which may require older dependencies) in order to reproduce results. While Anaconda can be used for this purpose and I think it has some great ideas for things like binary relocation, it also suffers from dependency hell and does not have the same level of package curation that we do here. On more than one occasion my fellow researchers have expressed a desire to be able to use Homebrew for this purpose instead of Anaconda or containers because of the benefits I just mentioned.

I guess I'm thinking about this in the same spirit as https://github.com/Homebrew/discussions/discussions/2031: to find a way to make this work without disrupting or inconveniencing our existing functionality. I would not propose changing the policy in homebrew-core of only maintaining the latest version of formulae. But it would be great if we were able to make these virtual environments possible to use in other taps like BrewSci where there would be more value in being able to maintain older versions of formulae.

Apologies if my posts on this are lacking in brevity! I have spent a lot of time grappling with this issue of managing multiple software versions in my job so I've thought a lot about possible solutions, and I think it's good to hash out the details.

MikeMcQuaid commented 3 years ago

Specifically in the case of using opt_lib in RPATH/install names, my understanding is that we do this to avoid having to rebuild binaries every time a formula gets an update that does not break ABI compatibility.

Partly. Also partly so tooling outside of Homebrew doesn't see/rely on/use the cellar paths because they are not exposed to them.

We find ourselves needing to run specific versions of tools (which may require older dependencies) in order to reproduce results.

Our way of doing this is @ versions and brew extract (or checking out homebrew-core to an older revision). Beyond that: I really don't want us to increase our support for arbitrary old version usage because it's a bad security practise much of the time (albeit not in the case you mention, to be clear).

I would like to lean into finding ways to make @ versions and brew extract work better, perhaps figuring out how to make use of e.g. old bottles (which is currently tricky/not possible due to the actual formula name changing).

Apologies if my posts on this are lacking in brevity! I have spent a lot of time grappling with this issue of managing multiple software versions in my job so I've thought a lot about possible solutions, and I think it's good to hash out the details.

No apologies needed! Your input here is great ❤️

danielnachun commented 3 years ago

Our way of doing this is @ versions and brew extract (or checking out homebrew-core to an older revision). Beyond that: I really don't want us to increase our support for arbitrary old version usage because it's a bad security practise much of the time (albeit not in the case you mention, to be clear).

I would like to lean into finding ways to make @ versions and brew extract work better, perhaps figuring out how to make use of e.g. old bottles (which is currently tricky/not possible due to the actual formula name changing).

I totally forgot about brew extract, and this could potentially make this problem a lot simpler. Here's one idea: we add something like --freeze as an option to brew extract. This flag would check out homebrew-core at the specific revision corresponding to whatever version the user wants, and would recursively install any older dependencies as needed. In this "frozen" formula, we would replace opt dir paths in RPATH/install names/shebangs with the exact cellar path.

This I think would leave the bottles which are currently linked to $HOMEBREW_PREFIX/opt completely unaltered, while still allowing us to use exact cellar paths when an exact versioned dependency is needed. Since the whole purpose of "freezing" a formula's dependencies is not to upgrade them, we wouldn't have to worry about rebuilding after ABI-incompatible dependency updates. The only change that I believe would be needed to the way brew behaves right now would be that when brew upgrade is run, we would have to check if the previous version is needed by any "frozen" formulae (this would have to be recorded somewhere), and if so then this previous version would also be "frozen" instead of being deleted.
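
As a rough sketch of that upgrade-time check: assume the "recorded somewhere" data is a mapping from each frozen formula to the exact dependency versions it was built against. That mapping, the function names, and the formula names and versions below are all hypothetical; nothing like this exists in brew today.

```python
def frozen_dependents(formula, old_version, frozen_requirements):
    """Return the frozen formulae that still need `formula` at `old_version`."""
    return sorted(
        name
        for name, deps in frozen_requirements.items()
        if deps.get(formula) == old_version
    )


def cleanup_action(formula, old_version, frozen_requirements):
    """Decide what an upgrade should do with the keg it is replacing."""
    if frozen_dependents(formula, old_version, frozen_requirements):
        return "freeze"  # keep the old keg because a frozen formula links to it
    return "delete"  # nothing pins this version, so normal cleanup applies


# Illustrative record only: frozen formula -> {dependency: exact version}.
requirements = {
    "tool-a@1.0": {"libfoo": "2.3"},
    "tool-b@4.1": {"libfoo": "2.4", "libbar": "0.9"},
}
action = cleanup_action("libfoo", "2.3", requirements)
```

The check itself is cheap. The harder part is keeping the record accurate when frozen formulae are installed or removed.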

In this scenario, $HOMEBREW_PREFIX/opt would still point to the newest version, and we would not allow brew link to link "frozen" kegs into $HOMEBREW_PREFIX (but they could be linked to an environment in $HOMEBREW_PREFIX/envs). In order for a formula to require a specific version of a dependency, some sort of DSL with a syntax like depends_on "formula" => version: "version_number" could be used (and automatically added by brew extract --freeze), but it would be forbidden by an audit in homebrew-core. We'd also want to add something to brew doctor to throw clear warnings if someone manually linked a "frozen" keg into $HOMEBREW_PREFIX.

Hopefully this would allow for exact dependency versioning using existing bottles in unsupported taps linked to alternative prefixes, while making sure that homebrew-core and formulae linked to $HOMEBREW_PREFIX still remain "evergreen".

carlocab commented 3 years ago

One difficulty here is that brew doesn't always know how to parse old formulae, because we make changes to the formula DSL and eventually drop support entirely for old syntax.

We could maybe encode the Homebrew/brew commit hash at the point of bottling somewhere, but I'm ambivalent about doing this. Build reproducibility is nice; security vulnerabilities are not.

MikeMcQuaid commented 3 years ago

I totally forgot about brew extract, and this could potentially make this problem a lot simpler. Here's one idea: we add something like --freeze as an option to brew extract. This flag would check out homebrew-core at the specific revision corresponding to whatever version the user wants, and would recursively install any older dependencies as needed. In this "frozen" formula, we would replace opt dir paths in RPATH/install names/shebangs with the exact cellar path.

Perhaps a stupid question: why do you need freeze at all? Why would checking out the specific revision be insufficient? Also: this is essentially what brew versions did and the support burden was fairly high (and presented bugs we are literally unable to fix due to the Git history being immutable) so would need to be caveated massively.

The only change that I believe would be needed to the way brew behaves right now would be that when brew upgrade is run, we would have to check if the previous version is needed by any "frozen" formulae (this would have to be recorded somewhere), and if so then this previous version would also be "frozen" instead of being deleted.

HOMEBREW_NO_INSTALL_CLEANUP kinda does this already. Fundamentally, though, I think having "frozen" and "up-to-date" versions of software living in the same prefix or cellar is a problem waiting to happen.

One difficulty here is that brew doesn't always know how to parse old formulae, because we make changes to the formula DSL and eventually drop support entirely for old syntax.

Yes, for a homebrew/core pin to work you'd need to pin homebrew/brew to an old version, too. Of course, this also means you'd not get any fixes for e.g. newer macOS versions.

In this scenario, $HOMEBREW_PREFIX/opt would still point to the newest version, and we would not allow brew link to link "frozen" kegs into $HOMEBREW_PREFIX (but they could be linked to an environment in $HOMEBREW_PREFIX/envs).

This is starting to sound pretty complicated...

Hopefully this would allow for exact dependency versioning using existing bottles in unsupported taps linked to alternative prefixes, while making sure that homebrew-core and formulae linked to $HOMEBREW_PREFIX still remain "evergreen".

I don't think we really want to enable sitting on old versions in taps indefinitely, either. To jump back to what you said earlier:

Where I think it may be more useful is in research, where I and several other maintainers and a decent chunk of our user base work. We find ourselves needing to run specific versions of tools (which may require older dependencies) in order to reproduce results.

In this case: sitting on an old version of both Homebrew/brew AND Homebrew/homebrew-core AND having the base system be the same (i.e. not macOS as macOS/Xcode versions change too often) should provide this reproducibility. It feels like something a brew reproduce or similar external command could nicely handle and use e.g. a Docker image for, too. All that would need to be published for reproducibility is the two Git revisions and e.g. the Docker image and its revision.
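
A minimal sketch of what such a published record could look like, assuming a small JSON manifest. The class name, field names, and values are hypothetical, and brew reproduce does not exist. The two revisions could be captured by running git rev-parse HEAD in the directories printed by brew --repository and brew --repository homebrew/core.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ReproManifest:
    """Everything the comment above says needs publishing (hypothetical layout)."""

    brew_revision: str   # Git revision of Homebrew/brew
    core_revision: str   # Git revision of Homebrew/homebrew-core
    image: str           # base image name, e.g. a Docker image
    image_revision: str  # revision or digest of that image

    def to_json(self):
        # Sorted keys keep the file stable across runs, so it diffs cleanly.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


# Placeholder values for illustration only.
manifest = ReproManifest(
    brew_revision="a" * 40,
    core_revision="b" * 40,
    image="example/brew-base",
    image_revision="example-revision",
)
restored = ReproManifest.from_json(manifest.to_json())
```

An external command could then check out both repositories at the recorded revisions inside the recorded image before installing anything.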

We could maybe encode the Homebrew/brew commit hash at the point of bottling somewhere, but I'm ambivalent about doing this. Build reproducibility is nice; security vulnerabilities are not.

homebrew_version and tap_git_head are already stored in the tab for bottles (and in the JSON for bottles built with --only-json-tab e.g. https://github.com/Homebrew/homebrew-core/pkgs/container/core%2Fopenssl%2F1.1/6980813)
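
For illustration, a sketch of pulling those two fields out of a tab's JSON. It assumes homebrew_version is a top-level key and tap_git_head sits under a "source" object. That layout is an assumption to verify against a real tab (the INSTALL_RECEIPT.json in an installed keg), and the sample values are made up.

```python
import json


def provenance_from_tab(tab_text):
    """Extract the two revisions needed to reproduce a bottle's build.

    Assumed layout: top-level "homebrew_version", nested "source" -> "tap_git_head".
    Missing keys come back as None rather than raising.
    """
    tab = json.loads(tab_text)
    source = tab.get("source") or {}
    return {
        "homebrew_version": tab.get("homebrew_version"),
        "tap_git_head": source.get("tap_git_head"),
    }


# Made-up sample; a real tab contains many more fields.
sample_tab = json.dumps(
    {
        "homebrew_version": "0.0.0-example",
        "source": {"tap": "homebrew/core", "tap_git_head": "0123abcd"},
    }
)
provenance = provenance_from_tab(sample_tab)
```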

Hopefully this isn't too blunt/rude: this data has been stored for literally years with the promises that it will be used for this sort of reproducibility. I personally have made multiple PRs to enable the workflows I describe above but: no-one has stepped up to actually do the work to stitch it together. I hope that someone does that but until then I'm a pretty hard 👎🏻 on any more changes or features being made to Homebrew/brew or Homebrew/homebrew-core to potentially enable some functionality in future.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
