Open jasagredo opened 5 months ago
I told you it's not a good idea to enable/disable XDG based on directory existence :)
There is one more alternative, due to
If the CABAL_DIR environment variable is set, the indicated directory will be used to store all Cabal-related files, as in previous versions.
(e.g., see https://github.com/haskell/actions/issues/210). So, a user of this action could set this variable and be safe.
I vote for all of above. I don't see a reason the choice should be made by the [edit: images or the] runner and when it is, let's revert it early.
There is one more alternative, due to
If the CABAL_DIR environment variable is set, the indicated directory will be used to store all Cabal-related files, as in previous versions.
So maybe simply the action should set CABAL_DIR
at the start to point to ~/.cabal
and then XDG would be totally disabled?
So maybe simply the action should set CABAL_DIR at the start to point to ~/.cabal and then XDG would be totally disabled?
I think @jasagredo proposed that and it sounds fine. Then, if people want to switch their Haskell CI to XDG, creating the XDG directory won't work, they'd need to set the CABAL_DIR
accordingly. A bit more of a hassle, but less likely to be done by accident, either by the CI users or by the container image creators.
Doesn't seem like a good idea to set this variable. It'll cause more confusion.
Doesn't seem like a good idea to set this variable. It'll cause more confusion.
What kind of confusion? By people that already switched their CI to XDG and it would switch their setup back? Are there any such brave souls? Or in the future, if the cabal manual does not mention the variable prominently enough (does it?)?
What kind of confusion?
This changes the behavior of the action.
What is the interplay with users setting the env var somewhere in the action? Does the action pick it up or overwrite it? Will it now detect the store-dir
properly?
What is the interplay with users not setting the env var in the action?
How do users switch to XDG or away from XDG?
Avoid pre-setting more user-exposed env vars. Have sane defaults. Don't make decisions based on directory existence.
One problem we are facing with the XDG migration is that users might have hardwired the path ~/.cabal/store
into their caching step. Failure of caching does not trigger failure of the action, so users might for a long time not notice that their caching of dependencies is non-functional when we switch this action to XDG based on a pre-installed XDG-cabal.
The goal is thus to force cabal away from XDG.
I don't see a way to achieving this goal (in our multi-cabal context) without setting CABAL_DIR or removing the XDG cabal directories, but I am happy to learn otherwise.
Of course, forcing cabal to use the old directory structure means that we hardcode it here in the action, which is an ugly thing to do.
@hasufell if you have an idea how to do it otherwise, I'd happy to learn, but please be concrete, as I am failing to understand how you envision this to work atm.
One problem we are facing with the XDG migration is that users might have hardwired the path ~/.cabal/store into their caching step. Failure of caching does not trigger failure of the action, so users might for a long time not notice that their caching of dependencies is non-functional when we switch this action to XDG based on a pre-installed XDG-cabal.
I don't think it's this actions responsibility to keep up with every breakage that cabal or ghc cause.
That said, I think the current approach here isn't proper either:
I would:
CABAL_DIR
on windows only (I'm hoping cabal doesn't try to do XDG there)store-dir
(I know cabal doesn't have a proper interface, but it can be done)cabal-store
for their caching instead of hardcoding paths?I'm absolutely fine to handle step 5 and announce anywhere and everywhere I've made a mistake. However, we still need volunteers to handle the other tasks, which includes a design discussion for store-dir
, revisiting the XDG design discussion with the original author and others than took part in it and agreed on the design, the actual implementations, transition plan, etc. I don't think this is going to be ready any time soon.
For now, I'm guessing, haskell-actions still requires a way to work with the current XDG implementation and prevent both the caching problem Andreas mentions and the CI breakage we just experienced. Making sure (adding checks? adding rm -rf
?) the XDG directories are not passed to the user is one way. Setting CABAL_DIR
is another. I don't have an informed opinion on which is better.
Whatever the decision, I don't recommend migrating haskell-actions to XDG in the near future. The original plan was that the details of XDG are subject to change and that we count on feedback from brave users [edit: or new users that don't care, since XDG is the default for new setups] enabling it for their particular projects, not from tools or systems that'd move many users to XDG at once and at unawares. As expected, unforeseen problems emerge and that may not be the end of them.
One way to simplify this is to remove the automatic backwards compatibility code in cabal and always use XDG unless CABAL_DIR is set.
One way to simplify this is to remove the automatic backwards compatibility code in cabal and always use XDG unless CABAL_DIR is set.
This is going to cause even more problems. Please don't.
I suggest setting CABAL_DIR=$HOME/.cabal
. This will make cabal
behave as it did before the XDG changes (including sensitivity to users further changing CABAL_DIR
somewhere in their action). Using XDG can work fine in CI, but carries no advantages.
Please open an issue at https://github.com/actions/runner-images/issues and investigate why XDG is enabled automatically.
It's possible that it has to do with brew doing shenanigans. The actual macOS ghcup provisioning does not enable XDG: https://github.com/actions/runner-images/blob/main/images/macos/scripts/build/install-haskell.sh
Maybe related: https://github.com/actions/runner-images/pull/8778
If something runs cabal update
along the way, that'd explain why the config file is created. I don't see that in those scripts, but I don't really understand how the infrastructure works.
If something runs cabal update along the way
That's likely the installation of yq
then, which depends on pandoc and hence ghc/cabal:
cabal v2-update
: https://github.com/Homebrew/homebrew-core/blob/3ef7d3e13549bdb977d180600b77d2ae02962378/Formula/c/cabal-install.rb#L52edit: well, scratch that, of course ghcup also runs cabal update
: https://github.com/haskell/ghcup-hs/blob/55030d83da6d59196df28c0eb115cdf19ff20393/scripts/bootstrap/bootstrap-haskell#L823
That's true, but Homebrew resets $HOME
and such to a temporary directory when building a formula. It should not have any effect on the user's real home directory and cabal state. It certainly didn't when I (re-)wrote Homebrew's Haskell infrastructure a few years ago, but I no longer have a macOS system to test it. Homebrew's sandboxing during building is quite crude, and there may actually be a risk that the XDG change in cabal broke it.
Edit: oh yeah, that ghcup command is definitely the cause.
oh yeah, that ghcup command is definitely the cause
What is the proper way of disabling XDG during cabal update
(unless it's already enabled)?
Same as any other cabal
command: set CABAL_DIR
or point CABAL_CONFIG
to a config file that sets all directories explicitly.
Same as any other
cabal
command: setCABAL_DIR
or pointCABAL_CONFIG
to a config file that sets all directories explicitly.
Those are both poor options, imo.
I've decided to attempt this: https://github.com/haskell/ghcup-hs/pull/977
If that works correctly, then the github runners will fix themselves in the next deployment. It might make sense to add a test-case though to check that XDG is indeed disabled: https://github.com/actions/runner-images/blob/main/images/macos/scripts/tests/Haskell.Tests.ps1
So it was probably the recommended
tag bump from 3.6.x.x to 3.10.x.x that lead to this whole thing, because previously the bootstrap script would install 3.6 and then run cabal update
with that old version.
That seems like it has the exact same behaviour as setting CABAL_DIR=$HOME/.cabal
. Am I missing something?
That seems like it has the exact same behaviour as setting
CABAL_DIR=$HOME/.cabal
. Am I missing something?
The difference is that I don't set CABAL_DIR
.
https://github.com/haskell/ghcup-hs/pull/977 has been merged and deployed, which means the next time the github runners are deployed, they should pick up the change (which is every 2 weeks or so).
I have now also been bitten by this: https://github.com/haskell/alex/actions/runs/7680499933/job/20932609305?pr=257
Image: macos-12 Version: 20240119.1
These macos runners come with existing XDG directories for cabal. Since the start of the runner, the following files are present:
.config/cabal/config
.cache/cabal/packages/...
In this haskell-actions/setup there is a deliberate choice (decided in haskell/actions#210) to create
~/.cabal/bin
(here) in order to disable XDG, however, cabal will still think it is using XDG because of the existence of~/.config/cabal
. See this logic.I do not know which one should be the path forward:
~/.cabal
and~/.config/cabal/config
exist but not~/.cabal/config
(as is happening here).The 20240105 image is still very new, only 9.79% of rollout progress as I'm writing this, but this still has resulted in CI pipelines failing for example in Cabal itself. 1 2 and more if you look at the
Validate
workflow.