Closed hseg closed 6 months ago
(This should go without saying, but this workflow used to work as late as two weeks ago with stack-static 2.15.1. At least it's good to know the workaround of explicitly calling `stack setup` works.)
On Sat, Mar 16, 2024 at 03:50:31PM -0700, Mike Pilgrem wrote:
@hseg, thanks for reporting.
On Windows 11, with Stack 2.15.3, first run of `stack build` outside of a project directory:
- creates the missing `config.yaml` in the Stack root
- creates the missing `global-project` in the Stack root (which is populated with `lts-22.11` due to https://github.com/commercialhaskell/stack/issues/6516)
- fetches the requested version of GHC
- correctly, throws error [S-8506] (no target)
On the Ubuntu distribution of Linux (via WSL), Stack 2.15.3 does the same.
Do you get the same problem on Arch Linux with an 'official' build of Stack 2.15.3 for Linux?
The official stack build for Arch Linux is on 2.9.1 and to test it would require rebuilding 67 dependencies. Can try later this week.
Though an important note -- the bug does not occur with a fresh project, but rather with a fresh stack install -- i.e. `$XDG_DATA_HOME/stack` empty, so that `stack build` needs to fetch GHC. Though by your description you might be exercising the same code path, so this point might be moot.
On Sun, Mar 17, 2024 at 04:49:50AM -0700, Mike Pilgrem wrote:
By 'official', I meant the binary distributions provided via this repository - but I would include the ones provided via GHCup too. Understood on a 'fresh install' - I think I deleted enough of my Stack root to mimic that (I deleted the entire Stack root on Ubuntu).
So testing with a ghcup-vendored stack, I reproduce your lack of errors, so presumably it's the github releases that are at fault. Indeed, they differ:
$ file ghcup/stack-2.15.3-linux-x86_64/stack
ghcup/stack-2.15.3-linux-x86_64/stack: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=491a5805a6dd0316cc949ec155d3e4cd2b5f269d, stripped
$ file pacman/stack-2.15.3-linux-x86_64/stack
pacman/stack-2.15.3-linux-x86_64/stack: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=27d126ba061fdc227fccd6ae92feb90715899073, stripped
(ghcup pulls https://downloads.haskell.org/~ghcup/unofficial-bindists/stack/2.15.3/stack-2.15.3-linux-x86_64.tar.gz while the PKGBUILD I installed pulls https://github.com/commercialhaskell/stack/releases/download/v2.15.3/stack-2.15.3-linux-x86_64.tar.gz )
However, for some reason I can't reproduce with my locally-packaged stack either right now. So either this was a network/infra error that's been fixed, or it's due to me testing this at uni today -- will check again when I come home later today.
On Sun, Mar 17, 2024 at 10:48:58AM -0700, Mike Pilgrem wrote:
That `pacman/stack-2.15.3-linux-x86_64/stack` is indeed the same as the 'official' Stack 2.15.3:
$ file /home/mpilgrem/.local/bin/stack
/home/mpilgrem/.local/bin/stack: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=27d126ba061fdc227fccd6ae92feb90715899073, stripped
Further testing suggests this is due to a poor interaction between `stack` and `makepkg` -- could only reproduce when running `stack` under `makepkg`.
Going through the logs, suspected the `{C,CXX,LD,MAKE}FLAGS` that `makepkg` sets -- these might interfere with `stack` operation.
Their values are:
DEBUGFLAGS="-g -ffile-prefix-map=/tmp/src=/usr/src/debug/test-stack -flto=auto"
CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions \
-Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security \
-fstack-clash-protection -fcf-protection \
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer \
$DEBUGFLAGS"
CXXFLAGS="$CFLAGS -Wp,-D_GLIBCXX_ASSERTIONS $DEBUGFLAGS"
LDFLAGS="-Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now \
-Wl,-z,pack-relative-relocs -flto=auto"
MAKEFLAGS="-j4"
And indeed (see stack-build-envvars.log), setting these variables makes `stack` bork. Am out of time and spoons for tonight to bisect these assignments to see which one is to blame, hope these help.
(can set these flags to different values, just need to know which ones to kick)
EDIT: Github disallows updating email replies to markdown format even after the fact, logs attached to comment below.
Github really doesn't like email responses, does it? Cleaned up previous comment, sorry for the mess in your inboxes. The following are the files I attempted to attach to the previous comment: makepkg.log stack-build-envvars.log stack-build.log
And since Github doesn't like files without filetypes:
pkgname=test-stack
pkgver=1
pkgrel=1
pkgdesc='Testing running stack in PKGBUILD'
arch=('x86_64')
url='https://github.com/commercialhaskell/stack'
license=('CC0')
makedepends=('stack')
source=()
prepare() {
  #stack setup
  :
}
build() {
  stack build --verbose
}
package() {
  :
}
On Sun, Mar 17, 2024 at 04:12:02PM -0700, Mike Pilgrem wrote:
I am not familiar with `makepkg`. In the failure log, this line during GHC's configuration process seems important:
2024-03-18 00:35:55.435076: [error] configure: error: Failed to determine machine word size. Does your toolchain actually work?
It seems that something disables GHC's configuration on installation. That is, if this is an issue, it currently seems to me to be at the level of GHC rather than Stack.
Hence my minimizing of the bug to being due to the `{C,CXX,LD,MAKE}FLAGS` environment variables -- running `stack build --verbose` with these environment variables set to the values I posted in https://github.com/commercialhaskell/stack/issues/6525#issuecomment-2002643812 reproduces the failure to install GHC I originally reported, without needing anything Arch-specific as far as I can tell.
(These are the default settings Arch uses to build packages it distributes, hence why they were set when running `stack` under `makepkg`.)
Stack installs GHC by, essentially, following programmatically the manual install instructions in the `INSTALL` file provided with the GHC binary distribution: https://downloads.haskell.org/~ghc/9.6.4/ghc-9.6.4-x86_64-fedora33-linux.tar.xz. Stack itself does not pay any attention to any of those environment variables.
I searched GHC's issues for `makepkg` but did not identify an existing issue: https://gitlab.haskell.org/search?group_id=2&scope=issues&search=makepkg.
As you understand better than me what `makepkg` is and how it might be adversely affecting GHC's `configure` script, you are probably better placed than me to raise a GHC issue.
Pinging @hasufell (an expert in installing GHC on various Linux distributions) in case he can provide any insight.
OK, so I minimized the breaking situation:
export LDFLAGS='-Wl,-z,pack-relative-relocs'
stack build --verbose --stack-root "$(mktemp -d)"
Relevant Arch RFC to provide context: https://rfc.archlinux.page/0023-pack-relative-relocs/
Logs attached, though presumably Github will reproduce them below instead.
@hseg, that stack.log file seems to be 0 bytes in size. Can you re-supply it?
@hseg, also - to rule things out - noting "supported since glibc 2.36, GNU Binutils 2.38 and LLVM 15": can you confirm your system has all of those prerequisites?
Found on the Internet - by analogy, could it be LLVM-related:
Extracts from GHC's `configure` script:
# ac_fn_c_compute_int LINENO EXPR VAR INCLUDES
# --------------------------------------------
# Tries to find the compile-time value of EXPR in a program that includes
# INCLUDES, setting VAR accordingly. Returns whether the value could be
# computed
ac_fn_c_compute_int ()
{
...
# The cast to long int works around a bug in the HP C Compiler
# version HP92453-01 B.11.11.23709.GP, which incorrectly rejects
# declarations like `int a3[[(sizeof (unsigned char)) >= 0]];'.
# This bug is HP SR number 8606223364.
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking size of void *" >&5
$as_echo_n "checking size of void *... " >&6; }
if ${ac_cv_sizeof_void_p+:} false; then :
$as_echo_n "(cached) " >&6
else
if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (void *))" "ac_cv_sizeof_void_p" "$ac_includes_default"; then :
else
if test "$ac_cv_type_void_p" = yes; then
{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
as_fn_error 77 "cannot compute sizeof (void *)
See \`config.log' for more details" "$LINENO" 5; }
else
ac_cv_sizeof_void_p=0
fi
fi
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_sizeof_void_p" >&5
$as_echo "$ac_cv_sizeof_void_p" >&6; }
cat >>confdefs.h <<_ACEOF
#define SIZEOF_VOID_P $ac_cv_sizeof_void_p
_ACEOF
if test "x$ac_cv_sizeof_void_p" = "x0"; then
as_fn_error $? "Failed to determine machine word size. Does your toolchain actually work?" "$LINENO" 5
fi
Oops. Another attempt at attaching the log: stack.log
As for the package versions:
$ pacman -Q binutils glibc llvm-libs
binutils 2.42-2
glibc 2.39-1
llvm-libs 17.0.6-2
Don't have `llvm` itself installed, just the runtime libraries. Unsure what's going on there.
Indeed, tracing the `./configure` output, it appears it's invoking GCC, so that bit isn't exotic.
Found https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/merge_requests/6#note_171460, will report there later today, see what they have to say
Archlinux is a clusterf*ck. It has been one of the worst distributions to use Haskell on. First they forced dynamic linking via their PKGBUILDs, causing so much trouble with cabal and other toolings. They really have no idea what they are doing.
We're already warning users about arch being broken crap: https://www.haskell.org/downloads/
Do not use the Haskell development tools provided by Arch, they are broken. For more information see [1] [2].
This just reinforces it. Don't use arch. They don't know what they are doing.
Wrt the `LDFLAGS`... I asked some people knowledgeable about linking and they think it's bonkers to force `pack-relative-relocs`.
Reported this at Arch as well, here's hoping we find a solution better than "burn it all down": https://gitlab.archlinux.org/archlinux/packaging/packages/pacman/-/merge_requests/6#note_171667
Though the brokenness of Haskell on Arch is indeed why I build all my Haskell programs statically, relying on `stack` for dependency resolution rather than `pacman`.
I am going to close this issue, from Stack's perspective, as it seems to be, squarely, 'upstream'. The discussion and links here should be of help to other Arch Linux users who encounter it.
I still think this should also be raised as a GHC issue.
Fair enough, raised: https://gitlab.haskell.org/ghc/ghc/-/issues/24565
Another tangent that could be explored here -- why is it that `stack build` with these `LDFLAGS` fails, but `stack setup && stack build` succeeds?
On Tue, Mar 19, 2024 at 11:18:24AM -0700, Mike Pilgrem wrote:
On that question, the answer seems to be that the environment was set differently in each case (which you can see in the logs in the line `[debug] menv = fromList ...`).
If we take the log files above in order:
- `stack-bad.log`: `LDFLAGS` not set -> GHC configures
- `stack-good.log`: `LDFLAGS` not set -> GHC configures
- `makepkg.log`: `LDFLAGS` set -> GHC does not configure
- `stack-build-envvars.log`: `LDFLAGS` set -> GHC does not configure
- `stack-build.log`: `LDFLAGS` not set -> GHC configures
- `stack.log`: `LDFLAGS` set -> GHC does not configure
The logs above do not, in fact, include a case where (a) `LDFLAGS` was set and (b) `stack setup` succeeded.
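The per-log classification can be repeated mechanically. The following is a minimal sketch, assuming each log contains the `[debug] menv = fromList ...` line mentioned in this thread and that a set variable appears by name on that line. The function names `ldflags_in_menv` and `classify_logs` are hypothetical, and the exact layout of the log line is an assumption rather than something confirmed here.

```shell
# Hypothetical helper: succeeds (exit 0) if any 'menv = fromList' line in the
# given Stack log mentions LDFLAGS. Assumes the variable name appears verbatim
# on that line when it is set; the exact log layout is not guaranteed.
ldflags_in_menv() {
    grep -F 'menv = fromList' "$1" | grep -q -F 'LDFLAGS'
}

# Print one classification line per log file given as an argument.
classify_logs() {
    for log in "$@"; do
        if ldflags_in_menv "$log"; then
            printf '%s: LDFLAGS set\n' "$log"
        else
            printf '%s: LDFLAGS not set\n' "$log"
        fi
    done
}
```

For example, `classify_logs stack-bad.log stack-good.log makepkg.log` would print one line per file. Mentions of `LDFLAGS` elsewhere in a log are deliberately ignored, since only the recorded environment matters for this comparison.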
Hrm. Indeed, testing
export LDFLAGS='-Wl,-z,pack-relative-relocs'
stack setup --stack-root "$(mktemp -d)" --verbose
I indeed obtain the `Failed to determine machine word size` error.
A little investigating later, it appears that what I was noticing was intended `makepkg` behaviour -- it sets `LDFLAGS` as non-exported in `prepare()` (which is where I was running `stack setup`), but as exported in `build()` (which is where I was running `stack build`).
And indeed, moving the `stack setup` to `build()` reproduces the error.
So our attribution of the error to the `LDFLAGS` setting is correct, and `makepkg` just muddied the waters.
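The difference between a variable that is merely set and one that is exported can be seen without `makepkg` at all. The following is a small sketch in plain shell; it does not run `makepkg` or `stack`, and the helper name `child_view` is hypothetical. It only illustrates why a child process (such as `stack`, and the `configure` script it runs) sees `LDFLAGS` in one case and not the other.

```shell
# Hypothetical helper: report what a child process sees for LDFLAGS.
child_view() {
    sh -c 'printf "%s\n" "${LDFLAGS-unset}"'
}

# Start from a clean slate so an inherited LDFLAGS does not skew the result.
unset LDFLAGS

# Set but not exported: visible to this shell only, which is the situation
# described above for prepare().
LDFLAGS='-Wl,-z,pack-relative-relocs'
not_exported=$(child_view)

# Exported: inherited by child processes, which is the situation described
# above for build().
export LDFLAGS
exported=$(child_view)

printf 'not exported: %s\n' "$not_exported"
printf 'exported:     %s\n' "$exported"
```

The first line reports `unset` and the second reports the flag value, which matches the observation that `stack setup` only fails once it runs where `LDFLAGS` is exported.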
In reraising this for Arch, found the root cause -- `ghc` and other packages were building with `LD=ld.gold`, which doesn't support these `LDFLAGS`. GHC is now testing that `$LD $LDFLAGS` works before settling on that choice of `LD`. Might this be an idea for `stack` to implement as a sanity check? Suggested this for `cabal` as well: https://github.com/haskell/cabal/issues/9828
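A sanity check of the kind suggested here could look roughly like the following. This is a minimal sketch, not GHC's, Stack's or cabal's actual implementation: the function name `ldflags_work` is hypothetical, and it goes through a compiler driver (defaulting to `cc`) rather than invoking `$LD` directly, which is a simplifying assumption.

```shell
# Hypothetical helper: returns 0 if the given compiler driver (default: cc)
# can compile and link a trivial C program with the current $LDFLAGS,
# 1 if it cannot, and 2 if no temporary directory could be created.
ldflags_work() {
    cc_cmd=${1:-cc}
    tmpdir=$(mktemp -d) || return 2
    printf 'int main(void) { return 0; }\n' > "$tmpdir/t.c"
    # $LDFLAGS is intentionally unquoted so that it splits into separate
    # arguments, as build systems conventionally treat it.
    if "$cc_cmd" $LDFLAGS -o "$tmpdir/t" "$tmpdir/t.c" >/dev/null 2>&1; then
        rc=0
    else
        rc=1
    fi
    rm -rf "$tmpdir"
    return "$rc"
}
```

With `LDFLAGS='-Wl,-z,pack-relative-relocs'` set, a caller could run `ldflags_work gcc || echo 'toolchain rejects LDFLAGS' >&2` before starting a GHC install. To probe a specific linker through GCC, adding `-fuse-ld=gold` to `LDFLAGS` is one option, assuming that linker is installed.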
Yes, the choice of gold was an odd one. GHCup is already forcing ld.bfd on alpine, because gold is causing problems.
GHC 9.6.5, now released, includes "Ensuring we take LDFLAGS into account when configuring a linker (#24565)."
General summary/comments
On a fresh system (`$XDG_DATA_HOME/stack` empty), a bare `stack build` fails to build, erroring out at `Installing GHC`. However, `stack setup && stack build` succeeds. This is counter to my understanding of `stack build`'s intended behaviour when it fails to find an installed `ghc`.
Steps to reproduce
Expected
`stack build` notices no `ghc` is installed, invokes `stack setup`, then proceeds with build with the installed `ghc`.
Actual
Logs attached (ignore the S-8506 error in the good log, that is due to the testing directory not containing a `stack.yaml` file): stack-bad.log stack-good.log
Stack version
Method of installation
https://aur.archlinux.org/packages/stack-static, patched to install 2.15.3 binary release.
Platform
Arch Linux 6.7.9