Open martijnbastiaan opened 2 years ago
While seemingly opt-in, GEFs are currently the only way Cabal communicates what package databases it used and what packages are exposed to subcommands.
provide enough information to recover build setup (you need to guess where the compiler executable is though). E.g. https://github.com/phadej/cabal-extras/tree/master/cabal-docspec uses these and doesn't rely on environment files at all.
Thanks for the links! IMO these solutions are much more involved from a user perspective than Cabal setting an environment variable. The alternative, the way Stack handles it and what's proposed here, needs no specific support from GHC API consumers at all. And at the risk of sounding very negative, as a developer I'm not particularly thrilled about relying on external packages parsing Cabal internals and having two separate code paths for Stack and Cabal.
This is in no way trying to diminish the work that went into cabal-plan
and cabal-install-parsers
, I've used both and am very happy they exist.
(As a funny note: I did look into cabal-install-parsers
for doctest-parallel
but using it broke our CI in the weirdest way possible: https://gitlab.com/clash-lang/clash-compiler/-/jobs/1827067901. I never really got to the bottom of that. Presumably it forced an outdated dependency..?)
Let see.
I guess cabal-install
could always write an environment file next to plan.json
. It's a subset of the same information (+ location of package database), but in different format.
That feature shouldn't affect whether environment files are written in the root of the project: their use case is different.
Setting up in GHC_ENVIRONMENT
in cabal run
(and cabal test
) is one more implicit thing: bad. It should be possible to turn it off. In fact, I think it should be opt-in, i.e. off by default.
Thanks for the quick reply.
I guess
cabal-install
could always write an environment file next toplan.json
. It's a subset of the same information (+ location of package database), but in different format.
Cool
That feature shouldn't affect whether environment files are written in the root of the project: their use case is different.
Alright, I don't mind too much as long as the default is off.
Setting up in
GHC_ENVIRONMENT
incabal run
(andcabal test
) is one more implicit thing: bad. It should be possible to turn it off. In fact, I think it should be opt-in, i.e. off by default.
Generally I agree: explicit is better than implicit. But it doesn't seem too implicit to me that an executable run with cabal run
would run with the env. Stack has been doing that for years now, and I don't believe anyone has ever been bitten by it. Having said that, it wouldn't be a hassle to turn it on. Ideally it would be configurable per executable, something like:
test-suite doctests
[..]
with-ghc-environment: true
What do you think?
I agree with oleg that a simple baseline of always writing env files next to plan.json is a good idea -- I think this had been suggested as part of an earlier cleanup and got lost in the shuffle.
You note that cabal exec
is almost cabal shell
but doesn't build anything. I'm curious what the use case is for building things as part of it is, or why a cabal build
followed by a cabal exec
wouldn't suffice in that case?
And I also am not sure about what the specific use case is for cabal run
(and test) having the env variable set if the file is known to be available in a fixed place. Shouldn't the former suffice to help make all necessary tools aware of it? Or would they all now work if they were first dropped into a cabal exec
?
Because any change to implicit stuff can always cause user frustration and confusion (as we learned the hard way a few times) I'd very much like to understand the exact use cases here.
Writing env files to dist-newstyle is a good first step.
You note that cabal exec is almost cabal shell but doesn't build anything. I'm curious what the use case is for building things as part of it is, or why a cabal build followed by a cabal exec wouldn't suffice in that case?
It would. cabal exec
takes whatever was built, creates an environment file for it, sets GHC_ENVIRONMENT
and runs an executable with it. The reason this is unsatisfactory in my opinion is that the successful execution of a command now relies on a manual build step. If forgotten, a user is faced with all kinds of error messages. Similar to how one can just do cabal run some-exe
without running cabal build some-exe
before it, I'd like my applications to work with one command invocation too.
But yes, cabal shell $target
would roughly be equivalent to cabal build $target && cabal exec bash
.
And I also am not sure about what the specific use case is for cabal run (and test) having the env variable set if the file is known to be available in a fixed place. Shouldn't the former suffice to help make all necessary tools aware of it?
Perhaps. A technical problem I see is that dist-newstyle
isn't a static directory, so you need to somehow figure out what --distdir
was during execution. Alternatively, you could try and find out where the running executable is located and traverse up from there in hopes of finding the env file. But more importantly, this assumes a very Cabal centrist look in the first place. The two projects I mentioned can be used with Cabal, Stack, and Nix. I don't have to write any workarounds for the latter two, and I was hoping Cabal could prepare the environments executables run in in a similar way.
Because any change to implicit stuff can always cause user frustration and confusion (as we learned the hard way a few times) I'd very much like to understand the exact use cases here.
Very fair. For what it's worth, I really struggle to imagine an executable that would want to access the GHC API without having access to the packages Cabal just built. Also, given that Nix and Stack have both been doing this for years now, I'm inclined to say it isn't really a problem. Lastly, if applications would want to access the GHC API without having access to built libraries, it would be as simple as erasing GHC_ENVIRONMENT
; which is much less effort than what it takes to properly integrate with Cabal right now.
I'm curious what the use case is for building things as part of it is
To give a very specific answer to this, I'm currently working on two projects: doctest-parallel
and clash-{prelude,lib,ghc}
. The first one uses the GHC API (indirectly, by spawning ghci
) to execute tests specified in documentation strings. In order to properly work, ghci
needs access to all the dependencies and libraries of the project. The second project is a compiler that compiles Haskell's core into synthesizable hardware designs (i.e., hardware descriptions you could build silicon out of). In order to do this, Clash needs to load core representations / unfoldings using the GHC API which again relies on access to all the packages built by Cabal (or Stack, or Nix).
As @phadej correctly remarked, all this information is available in the dist-newstyle
directory. In theory Clash and doctest-parallel
could use this information to extract build information and pass it down to the GHC API / ghci. However, this would require both projects to build workarounds digging into Cabal internals (while Stack and Nix both work "out of the box"), which leaves a sour taste in my mouth.
I currently work around this by recommending users to put write-ghc-environment-files: always
in their cabal.project
or "simply" use Stack. I don't like the former because it reintroduces all the unexpected behavior of env files, and I don't like the latter because I don't like pushing users to any specific technology.
Very fair. For what it's worth, I really struggle to imagine an executable that would want to access the GHC API without having access to the packages Cabal just built.
Thanks! Cabal itself is an interesting edge case, given that it manages its own dependencies.
In any case, would the following be (easily) implementable?
test-suite doctests
[..]
with-ghc-environment: true
In any case, would the following be (easily) implementable?
test-suite doctests
[..]
with-ghc-environment: true
It won't work as you expect. It's the Cabal
(the library) which interprets .cabal
files, builds and runs tests at the end. Also in stack and nix: the ./Setup
It is possible to make ./Setup
write environment files somewhere and set GHC_ENVIRONMENT
variable. (EDIT:: essentially just dumping what is passed to it, package-dbs and ids).
It might conflict with what stack does already atm. (test-suite "environment" is smaller then whole project).
I think you'd really need the environment variable then, as the ./Setup
's build directory is build tool dependent, cabal-install
's builddir for component is something like dist-newstyle/build/x86_64-linux/ghc-8.6.5/cabal-testsuite-3/build/cabal-tests/
for example.
That will require new cabal-version
.
Using old-and-proven doctest
might just work then, as it GHC_ENVIRONMENT
variable will tell it what it needs, you'd only need to specify global flags and where to find sources in test-suite driver. (I.e. completely? obsolete cabal-doctest
custom setup hack). That's a good thing.
I'd say it's implementable, but it's not easy.
EDIT: note that at the moment Cabal
the library doesn't know about environment files. (and/or GHC_ENVIRONMENT variable, thus #7792 e.g).
EDIT2: but it might be that some build tool doesn't actually call ./Setup test
, but rather runs the test-suite executables directly. Then with-ghc-environment
won't work. E.g. the cabal run test-suite-name
won't, so there should be a way to specify some GHC_ENVIRONMENT
for cabal run
. That sounds bad or at least inconvenient.
Hmm, I'm not familiar enough with Cabal's flow to properly understand what you're saying, but I think I get the gist.
EDIT: note that at the moment Cabal the library doesn't know about environment files. (and/or GHC_ENVIRONMENT variable, thus #7792 e.g).
Who's the one running the executables, is that cabal-install
? In that case I'd propose to add the flag to the cabal
binary and cabal.project
. I.e., you could run --run-with-ghc-environment=always
or run-with-ghc-environment: always
respectively. If I'm understanding you correctly this wouldn't interfere with Cabal-the-library. This flag would practically be the same as --write-ghc-environment-files=always
, except it writes it to dist-newstyle
and sets GHC_ENVIRONMENT
.
As far as the issue goes: it seems natural to me to unset / ignore GHC_ENVIRONMENT
for Cabal. Cabal is after all managing its own databases.
EDIT2: but it might be that some build tool doesn't actually call ./Setup test, but rather runs the test-suite executables directly. Then with-ghc-environment won't work. E.g. the cabal run test-suite-name won't, so there should be a way to specify some GHC_ENVIRONMENT for cabal run. That sounds bad or at least inconvenient.
Yes, this is a problem for Clash and doctest-parallel alike. I don't mind too much though, we just ask all our users to use cabal run
. Both are developer tools anyway, and tightly coupled to the codebase one's working on. There's no real need to install them as standalone binaries. (For Clash we do ship standalone binaries btw, but this is after using cabal-debian
or Nix - both of which provide a static environment.)
Also, I'd like to thank everyone in this thread for their contribution, it's much appreciated :)
Who's the one running the executables,
if it's cabal test
, then the underlying ./Setup
Which makes setting GHC_ENVIRONMENT
in cabal-install
tricky (impossible?) as ./Setup
should unset it to avoid implicit interactions!
TL;DR, cabal install
just calls to ./Setup
. stack
calls to ./Setup
, nix is calling ./Setup
. Tools just setup the environment to run ./Setup
successfully (i.e. manage dependencies).
Ok, so I think there's three parts here.
1) Always write an env file to dist-newstyle
2) Add a flag (which is then settable in cabal-project) to run with the env variable
3) cabal shell THING
as an alias for cabal build THING && cabal exec bash
I'm not super keen on the last still, since i hate adding new commands.
Could we instead add a flag to cabal-exec to tell it to build the deps for a given component before execing?
Who's the one running the executables,
if it's cabal test, then the underlying ./Setup
Which makes setting GHC_ENVIRONMENT in cabal-install tricky (impossible?) as ./Setup should unset it to avoid implicit interactions!
Hmm, perhaps I'm misunderstanding, but a naive read of CmdRun.hs
/ Run.hs
/ CmdTest.hs
makes me believe that while building everything might use Setup.hs
, actually running it does not:
End of runAction
:
Snippet of runProgramInvocation
:
I'm only interested in this specific invocation running with a GHC_ENVIRONMENT
set, which I believe cabal-install is able to provide at that point (or else I'm really unsure how --write-ghc-environment-files=always
works!)
@gbaz I understand the hate for adding new commands. I kinda tunnel-visioned myself on moving the env file, while writing it twice would perfectly suffice. I don't actually care about shell
myself :-)
@martijnbastiaan don't confuse run and test. tests and benchmarks are "run" during the "build" process. https://github.com/haskell/cabal/blob/980a532875efc2ddbab0761d4feb95fd247866a6/cabal-install/src/Distribution/Client/ProjectBuilding.hs#L1331-L1337, using setup...
ah oof..
Background
GHC environment files (from here on: GEFs) describe where GHC package databases reside and what packages from those databases should be exposed in a particular GHC session. As an example, here's what a GEF looks like after compiling
aeson-pretty
:Unless explicitly surpressed, any GHC session executed in either the directory containing the GEF or any directory under that will use this file to find and expose packages. Cabal used to generate these files automatically, however, this has been disabled as many users found it confusing. Users can now opt-in by passing
--write-ghc-environment-files=always
or by adding a similar section to theircabal.project
.I purposely won't go into the perceived downsides of GEFs, as I think this has been discussed many times over on various fora.
So what's wrong?
While seemingly opt-in, GEFs are currently the only way Cabal communicates what package databases it used and what packages are exposed to subcommands. This leads to some projects (
doctest-parallel
,clash-ghc
+ projects build with Clash) requiring their presence. In general, any downstream consumer of the GHC API is likely to want access to these files. There's currently a couple of ways these projects solve this, but without delving into them: this gets us right back to square one. That is, for these projects, all the downsides of GEFs are exposed once more.Brushing aside projects needing GEFs, even users who liked their behavior are now in a rough spot: they need to remember to either enable them globally, per project, or even on a per command basis.
Consumers
As far as I can see, there are two groups of GEF users:
cabal run
).Then, there's a (I believe, large) group that doesn't care at all but is prone to being bitten by unexpected behavior imposed by GEFs. This group overlaps with group (1).
Tackling this
I strongly believe a few things should happen:
dist-newstyle
.GHC_ENVIRONMENT
and set it to the GEF mentioned in (1) when running subcommands. E.g., forrun
andtest
.shell
that drops you into a shell withGHC_ENVIRONMENT
set. Likebuild
, one can set a target, or none at all. This command should be able to be run anywhere in your project. It should optionally take a command to run, e.g.cabal shell my-exe -- ghci
.Notes
In no particular order:
cabal exec
that generates a temporary GEF and setsGHC_ENVIRONMENT
. However,exec
doesn't actually build anything.--distdir
also exist.GHC_PACKAGE_PATH
andGHC_ENVIRONMENT
for subcommands. This works really well for the projects I work on..ghc.*
files - this is IMO where most of the frustration stems from.v2-*
commands in this RFC.