Open hasufell opened 5 months ago
Branch which re-enables --upgrade-dependencies
(in an unfinished way) - https://github.com/mpickering/cabal/tree/wip/upgrade-dependencies
It seems sensible to me to choose pre-installed packages, then you don't have to build them again. If the version constraints don't disallow it then the solver could choose that install plan anyway. If you really don't want a package to be part of the install plan then perhaps instead you want a means to instruct the solver to never choose a particular version as an additional constraint form.
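Such an additional constraint can already be expressed in a `cabal.project` file today. A minimal sketch (the package name and version are illustrative):

```cabal
-- cabal.project (sketch; package name and versions are illustrative)
packages: .

-- Forbid one specific version while leaving the rest of the range open,
-- so the solver can never pick the pre-installed 1.4.2.2:
constraints: filepath <1.4.2.2 || >1.4.2.2
```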
Regardless of the outcome of the discussion, it would make sense to find the PR that commented `--upgrade-dependencies` out, understand how this happened, and prevent it in the future, e.g. by guarding this functionality with tests and also by documenting it (better).
@mpickering I'm worried about defaults here. I don't think there's an easy way to tell my users that there was a subtle bug in `filepath`, in `splitFileName`. No one reads the changelog. GHC already ships the version.

What if this is a security bug? What do I do? I have no communication channels. A `cabal update && cabal build` should leave your project in the best possible state (bugfixes, security fixes) without further interventions/constraints required by the end users.

This is how all Linux distributions work, afaik. "Saving compilation time" seems like a questionable priority, imo (as a default).
@mpickering also says about `--upgrade-dependencies`: "There is the `--prefer-oldest` option already and this I suppose is a `--prefer-newest` ... decide how it should interact with `--prefer-oldest`".
CC @grayjay, @gbaz
I agree that `--prefer-newest` (so, no special-casing the boot packages) would be a cleaner default. Perhaps there should be `--prefer-installed` to get the current default if we change the default to `--prefer-newest`?
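For reference, the existing `--prefer-oldest` bias can also be set from a project file (I believe since cabal-install 3.10), so a `--prefer-newest`/`--prefer-installed` pair would presumably get the same treatment. A sketch:

```cabal
-- cabal.project (sketch; assumes cabal-install 3.10+)
packages: .

-- Existing knob: bias the solver towards the oldest admissible versions.
prefer-oldest: True
```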
While I like the default @hasufell proposed better, for the reasons given and also because it's more uniform, I worry it would hit hard those users with big sets of installed packages, which may include Nix users, Linux distribution users and v1-/cabal-env users (whether for teaching purposes or others). So we'd need some good backward compatibility scheme.
Another solution, which unfortunately increases complication, might be to treat libraries installed together with GHC (and perhaps all installed not by the user directly, but by install/upgrade scripts) specially and apply `--upgrade-dependencies` to them, while keeping user-installed libraries immutable, on the premise that the user knows what the user is doing (and that installing libraries is rare and discouraged).

Edit: which somehow agrees with how we treat local packages even if newer versions are on Hackage, local packages being "user-installed" and so automatic upgrades being disabled (I think?). I remember `apt` keeps track of which packages are directly requested by the user and which are only installed as dependencies, but this is a very distant analogy to installed Haskell packages and how cabal treats them.
CCing @Ericson2314 @angerman wrt Nix
I don't think there's an easy way to tell my users that there was a subtle bug in filepath in splitFileName. No one reads the changelog. GHC already ships the version.
What happens if you deprecate the version with the bug (the installed version)? Will cabal still prefer it?
Re Nix, I would like to have no notion of "preinstalled dependencies" because one should not be "preinstalling" things with Nix. So I like this.
In conclusion, Proud Nix Hater @hasufell has proposed something that I think is actually great for Nix. Thank you! :)
I'm a bit confused. In fact, the way we use nix at work involves "pre-installing" everything in the sense that everything goes into a package database which nix provides, instead of the cabal store, no?
So with the current behavior, if I am building `foo`, which depends on `bar-1`, and the latter is in the package database nix provides, then even if `bar-1.1` is released, `cabal build` will still use `bar-1`. With the new behavior, `cabal build` would download and build `bar-1.1` even though the nix configuration specified `bar-1`.
(the workaround for this, which is possible but slightly irritating, is to use a cabal.project or the like to disable hackage or any other package repository for packages developed in a nix provided environment)
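A sketch of that workaround, assuming cabal-install 3.4+ where the `active-repositories` field is available:

```cabal
-- cabal.project (sketch): make the solver ignore Hackage (and any other
-- remote repository) entirely, so only local packages and the
-- Nix-provided package db can satisfy dependencies.
packages: .

active-repositories: :none
```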
All that said, I modestly prefer the current behavior, in part because I'm afraid of changing this sort of stuff given the large and unpredictable effects it may have on many users, and in part because users will expect that if they have a working `containers` installed, then cabal will just use it.

I do think an explicit flag like `upgrade-dependencies` (though that is a terrible name, given its semantics, and `prefer-newest` or the like is better) is useful, to make this behavior more controllable.
I don't think cabal has any obligation to "not break nix". It's the nix packagers' obligation to keep it working.
Changes like the proposed one would be communicated early enough with a migration period, so that users can adapt and opt out of the changed behavior.
This is the same with the v1 vs v2 change that the cabal team executed over several years. Except this one seems much less disruptive.
I can't see how the current behavior is a sensible default from any angle, if it causes average users to miss bugfixes. It is not safe.
@gbaz What I mean is that in Haskell.nix-style approaches, planning takes place with no / empty package database. Ideally, even for "`cabal build` in Nix shell" usage, we'd still use the original pure / ex nihilo plan.

The current trick of "re-planning" in the Nix shell and hoping it solves for as few not-already-built things as possible is comparatively gross, and (IIRC) runs into issues when sources are funny (e.g. modified local packages).
(That said, the "already installed" constraint is useful for the above hack, and I imagine also useful for anyone that is wondering why their boot packages aren't being used under this issue's proposal.)
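That "already installed" constraint can be requested per package in a project file. A sketch (the package name is illustrative):

```cabal
-- cabal.project (sketch)
packages: .

-- Force the solver to pick the installed (e.g. GHC-shipped) filepath;
-- the "any." qualifier applies to the package wherever it occurs
-- in the build plan.
constraints: any.filepath installed
```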
@mpickering I'm worried about defaults here. I don't think there's an easy way to tell my users that there was a subtle bug in `filepath`, in `splitFileName`. No one reads the changelog. GHC already ships the version. What if this is a security bug? What do I do? I have no communication channels. A `cabal update && cabal build` should leave your project in the best possible state (bugfixes, security fixes) without further interventions/constraints required by the end users.

Speaking as a member of the Haskell Security Response Team, our hope is that cabal-install will be enhanced to directly use the data from the advisory database, and either omit affected packages from build plans by default, or alert users when build plans contain affected packages.
This issue poses some good questions but I don't think the SRT would have an opinion on it one way or the other, given the objective of more explicit cabal-install features/behaviour regarding known security issues.
The issue with forcing as many of the newest dependencies as possible is that your library/app might end up with a very different set of dependencies than your tests. If tests involve `doctest`, which is quite common, their build plan includes `ghc`-the-package and so sets in stone all boot libraries as shipped with GHC. The lib/app most likely does not depend on `ghc` and would be free to build against the latest and greatest boot packages. The overall effect would be that you are testing not what you are shipping.

I think making `ghc` reinstallable would be an important stepping stone.
What is perplexing for me is the following. Suppose we have

- `filepath`, which just happens to come with GHC, say `filepath-4.3.1`
- `wombat`, which just happens not to come with GHC.

I install a package `wimwam` using `cabal`, and it turns out that cabal's build plan installs the dependency `wombat-2.7.2`. Then `wombat` and `filepath` release bug-fixes, say `wombat-2.7.3` and `filepath-4.3.2`. Now I install `foogle`, which depends on `wombat` and `filepath`, but with very open upper bounds.

Question: when installing `foogle`, which versions of `wombat` and `filepath` will cabal pick? I understand @hasufell as saying that it will pick

- `filepath-4.3.1`, because it is "pre-installed"
- `wombat-2.7.3`, because cabal picks the newest if it can, even though `wombat-2.7.2` is already installed.

I am baffled about why we could possibly want to treat `wombat` and `filepath` differently, just because `filepath` happens, through some accident of fate, to come with GHC.
Which choice is best isn't obvious to me. But I can't see any justification for treating the two differently.
@simonpj
I am baffled about why we could possibly want to treat wombat and filepath differently, just because filepath happens, through some accident of fate, to come with GHC.
This is mostly correct, with the caveat that cabal treats any package that is in the global package db specially. It just so happens that `cabal v2-build` and `cabal v2-install` (which are now default) don't touch the global package db anymore, as opposed to `cabal v1-install --global` (legacy). So for most users, the global package db just contains what GHC ships with.

That, imo, makes it even worse. There are many other mechanisms to avoid cabal rebuilds (e.g. just don't run `cabal update`, use a freeze file, pass certain flags to cabal, ...). For saving space it seems totally backwards, and what we actually want there is: https://github.com/haskell/cabal/issues/3333
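To illustrate the freeze-file mechanism: `cabal freeze` writes the currently solved versions into `cabal.project.freeze`, pinning subsequent builds without any special-casing of the global package db. The versions below are purely illustrative:

```cabal
-- cabal.project.freeze (sketch; generated by `cabal freeze`,
-- versions are illustrative)
constraints: any.base ==4.18.2.1,
             any.directory ==1.3.8.1,
             any.filepath ==1.4.2.2
```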
@Bodigrim
If tests involve doctest, which is quite common, their build plan includes ghc-the-package and so sets in stone all boot libraries as shipped with GHC.
I understand doctest is a special case, but I do not believe this justifies having the current default.
Overall effect would be that you are testing not what you are shipping.
It is the maintainer's responsibility to ensure testing across multiple setups. If they can't do that, then their cabal version bounds are simply wrong or their test suite is sub-par.
For anyone wondering, there are two practical solutions to avoid doctest, e.g. tooling that depends not on the `ghc` library, but on ghci.

@frasertweedale
Speaking as a member of the Haskell Security Response Team, our hope is that cabal-install will be enhanced to directly use the data from the advisory database, and either omit affected packages from build plans by default, or alert users when build plans contain affected packages.
I don't want to digress too much on this, but I'm rather surprised by this sentiment and disagree rather strongly (wrt this being enough).
I'll put my response in a collapsible section to keep the thread clean. I'm happy to continue that discussion privately or on the security response team issue tracker.
@simonpj:
I install a package wimwam using cabal, and it turns out that cabal's build plan installs the dependency wombat-2.7.2
I think modern cabal works rather differently. The new cabal v2-build (or v2-install) builds wombat-2.7.2 locally and does not "install" it, at least not in the same sense that GHC distribution installs the packages it comes with. Modern cabal discourages installing any libraries and encourages building them anew (with smart caching via "store").
In fact, if GHC stopped providing/exposing the bundled packages, the problem of the exceptional treatment of installed packages would be immediately gone (until the user insists on manually installing some other packages, which is discouraged and hard to do properly). If GHC ships with the packages so that the user saves on compilation, it's no wonder cabal tries to accommodate it. However, I'm guessing GHC exposes the packages because the `ghc` package (and any other non-reinstallable packages?) depends on them and `ghc` can't be re-built/reinstalled/relinked (in particular, to depend on different versions/builds of its dependencies). See https://github.com/haskell/cabal/issues/9064#issuecomment-1609929980 and many related issues. I'm sure @bgamari or @mpickering could easily confirm or deny.

Therefore, the inconsistent cabal behaviour may be caused primarily by `ghc` and others not being reinstallable/rebuildable (and secondarily, by attempting backward compatibility for old v1-/Setup workflows, such as Nix, Linux distros, old setups for Haskell courses where each student is supposed to have the same versions of dependencies and freeze files were not yet a thing). If so, we can wait until `ghc`/others are reinstallable and then the problem vanishes (unless the user introduces it independently). Or we could try to limit the special cabal behaviour to build plans that include `ghc`, but then such build plans are treated specially. If `template-haskell` is another case of a non-reinstallable package that depends on reinstallable packages (I remember rebuilding it in the past, but that's no longer possible, I think?), this makes the specially treated build plans much more common and harder to describe succinctly to the user.
Am I anywhere close to the root cause of keeping this old functionality in modern cabal? Can cabal handle GHC in some alternative way without incurring this irregular behaviour? E.g., what can go wrong if GHC renames all the packages (in the package db and/or on Hackage) it bundles so that they can't be reinstalled at all?
Edit: actually, what happens if cabal reinstalls a dependency of `ghc` and uses it alongside the other copy of this dependency baked into `ghc`? I guess no outright disaster, but there can be subtle bugs due to subtle changes in behaviour between the versions? Is that why cabal is reluctant to reinstall?
Therefore, the inconsistent cabal behaviour may be caused primarily by ghc and others not being reinstallable/rebuildable
Yes indeed: a build plan that includes `ghc`-the-package must use exactly the dependency versions that `ghc`-the-package depends on. But that is a simple consequence of depending on `ghc`-the-package, which in turn depends on a particular wired-in version of `filepath`.

But let's suppose that your build plan does not depend on `ghc`-the-package or `template-haskell` (a very common case). Now `filepath` has no constraints -- cabal is entirely free to rebuild it locally. The fact that there is a pre-installed version is irrelevant, no? So my question remains: why is `filepath` (and other packages that happen to come with GHC) treated specially?
Therefore, the inconsistent cabal behaviour may be caused primarily by ghc and others not being reinstallable/rebuildable
I think this is not quite right. I don't think anyone is asking to change the behaviour when you depend on a non-reinstallable package: there we really do have to go with what GHC ships, and that's that.
I think the request is about packages that are reinstallable, but happen to have versions in the global package-db, like `filepath`. Then the preference seems less justifiable.
The reason that things in the `GlobalPackageDB` are treated specially is because of the definition of `corePackageDbs` in `Distribution.Client.ProjectPlanning`. This only looks in `GlobalPackageDb` and any extra package databases that a user has configured.
In general, it is a bit of an issue that `cabal-install` and `Cabal` assume anything about the structure of package databases (see #3728). There are many assumptions baked into both projects that you want to use the global package database and that the things in there are privileged. This is primarily a legacy from the old days when GHC was much less flexible about being able to specify a package database stack, but now it is completely agnostic.
why is filepath (and other packages that happen to come with GHC) treated specially?
Then the preference seems less justifiable.
My guess is that cabal covers the case of dependencies of non-reinstallable packages in a lazy way --- by treating specially all packages that reside in the relevant package DB [edit: and regardless of the build plan]. This has several advantages: simplicity of implementation, simplicity of configuration (though, as @mpickering states, this may be too hard-wired at this point), an extra benefit of backward compatibility for other legacy workflows using a central package DB, and simplicity of conveying the behaviour to the user (though it's probably not conveyed yet, or not well enough). Edit: one more advantage: this primitive solution does not increase the coupling of GHC and cabal, because the list of non-reinstallable packages that changes between GHC versions (#9092) does not need to be used for yet another purpose in cabal code.
Which is why I'm considering the other option, changing the behaviour of GHC, not of cabal. Or even of the GHC installer, e.g., making it rename all the packages it installs [edit: a less brutal variant: install them to another package db, if that matters]. That's probably absurd for fundamental reasons, but I'd like to improve my understanding of the situation by learning why exactly.
Edit: to be fair, the improvement of analysing the build plan (and auto-upgrading if the package in question is not a dependency of a non-reinstallable package) would not detract from the ease of configuration (though it would eliminate the backward compatibility side-benefit). However, I'm not able to predict what interplay it could have with the solver (because deciding to auto-upgrade changes the build plan, so perhaps we need to solve anew to verify all constraints are respected? what if the new solution implies the package should not be automatically upgraded?).
The comments here are swaying me towards supporting a change here. However, I really do feel it needs to be flag-controlled, and I'm definitely worried that any deep change like this could well confuse users and disrupt workflows in a way that is very unexpected and hard to diagnose, especially for those who don't read release notes.
That sounds fine to me. It's definitely a behavior change, quite arguably a breaking one. A new flag, switch default, etc. deprecation cycle should be fine.
OK, let's assume we want the change, and the minimal goal is to let users ensure they get the latest Hackage versions of all reinstallable packages without the need to edit any bounds anywhere nor add the `--constraint` option (but other options are fine).

Let's brainstorm and list all avenues we have, from the least likely to cause breakage to the most likely. I will update the list as proposals and facts emerge (I'm quite clueless).

1. Re-enable `--upgrade-dependencies`, document it, guard it against removal with tests.
2. ...
3. ...
4. Make the `--upgrade-dependencies` behaviour the default; provide the opposite flag (e.g., `--prefer-globally-installed`) with a "deprecation cycle" before the default changes.

Any more ideas?

Are 3 and 4 the same? Is 2 possible at all? Once we stop brainstorming, but not before, let's focus on drawbacks and benefits of the alternatives.
There's one more subtlety that can be confusing (and is in fact an argument for removing the special treatment).
GHC sometimes ships libraries with different cabal flag configurations than you would get installing straight from hackage. As @bgamari notes here, `text-2.0` shipped with GHC has the `simdutf` flag disabled, but the version on hackage has it enabled by default.

This can be confusing, as witnessed here: https://github.com/haskell/text/issues/487

Personally I would prefer a positive flag for preferring packages in the global package db, maybe `--prefer-globally-installed`, which could be on by default with a deprecation cycle to change the default. `--upgrade-dependencies` does not jump out to me as related to this issue!
GHC sometimes ships libraries with different cabal flag configurations than you would get installing straight from hackage
I think this is just https://github.com/haskell/cabal/issues/8702? Although if that was fixed, I'm unsure what we would expect to happen? Maybe:

- cabal computes a flag selection for `text`, which in the absence of user input would be `+simdutf`
- the installed `text` does not match that flag selection, so cabal doesn't use it
- cabal builds `text` itself (which might be fine!)

(EDITED: I got the default flag setting backwards)
The cabal solver seems to treat pre-installed packages specially (e.g. those shipped with GHC).
To reproduce:
This should cause a failure, because ghc-9.4.8 ships with filepath-1.4.2.2, but the package above uses modules from 1.4.100.1. The package has no upper bounds on filepath. For any other non-pre-installed package, the solver would pick the latest.
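A minimal sketch of a package matching that description (the package and module names are hypothetical; `System.OsPath` is one of the modules that only exists in filepath >= 1.4.100):

```cabal
-- repro.cabal (sketch; package and module names are hypothetical)
cabal-version:   2.4
name:            repro
version:         0.1.0.0

library
  exposed-modules:  Repro
  -- No upper bound on filepath; Repro imports System.OsPath,
  -- which requires filepath >= 1.4.100.
  build-depends:    base, filepath
  default-language: Haskell2010
```

With ghc-9.4.8, the solver happily picks the pre-installed filepath-1.4.2.2 and the build then fails at the import, instead of the solver choosing 1.4.100.1 from Hackage.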
I understand that this is by design, but I question this design here, because:
@mpickering found out that there used to be a `--upgrade-dependencies` switch, which is now disabled.

I argue that the default should be to pick the latest possible version anyway.
CCing some potentially interested parties: @simonpj @frasertweedale