Closed L0neGamer closed 6 months ago
@L0neGamer is it possible to reproduce the issue with other versions of GHC, newer than 9.2.8?
We have a CI job for Windows + GHC 9.2.8 which seems to succeed, so I'm at a loss as to what's up.
Sorry I wasn't clear, when I said 9.2.8+ I meant that version and onwards. Also tested on 9.4.8 and 9.6.4.
That's very weird. Can you contribute a reproducer expressed as a CI job?
Also, what's the Cabal version you are using?
cabal --version
-> 3.10.2.1
I'm not sure how I'd do the CI job thing but I can try look into it? It'd probably be best for someone else to though.
Looking at the CI jobs, the only two relating to Windows I can immediately see are one that builds and one that runs with bundled-c-zlib enabled, which is likely the issue here.
Can confirm that running cabal run -c 'zlib +bundled-c-zlib' results in the correct behaviour (that is, Test prints).
Well, but the job without bundled-c-zlib also succeeds in the CI environment, right? If it runs tests, it means that it linked successfully.
True. I don't know enough how this stuff works or what the windows environment looks like; if you've reading material or a suggestion of where to read up I can have a go at some stage.
The thing is that zlib links fine on a Windows machine I have access to. So I cannot investigate any further without a portable reproducer.
It might be worth raising the issue at https://gitlab.haskell.org/ghc/ghc/-/issues: it's GHC's responsibility to link correctly (or abort compilation if it's impossible to do so).
I'll look into raising it over there soon; at the very least maybe I'll be able to get a reproducer for here from them.
I was also bitten by this on my windows 10 machine. I was able to reproduce the issue while building cabal HEAD. GHC 9.4.8 and cabal 3.10.2.1.
@fendor please give me a reproducer in the form of a CI job.
No windows runner supported by github (it is just windows-2019 and windows-2022) seems to be able to reproduce the issue right now.
@fendor you can also try flipping the pkg-config flag: I suspect GHA runners are likely to have it pre-installed, but your local environment probably does not.
Otherwise file a GHC issue please.
With the pkg-config flag:
$ cabal repl exes --constraint="zlib +pkg-config"
Resolving dependencies...
Error: cabal-3.10.2.1.exe: Could not resolve dependencies:
[__0] trying: zlib-ghc-windows-0.1 (user goal)
[__1] trying: zlib-0.7.0.0 (dependency of zlib-ghc-windows)
[__2] trying: zlib:-bundled-c-zlib
[__3] rejecting: zlib:+pkg-config (conflict: pkg-config package zlib-any, not
found in the pkg-config database)
[__3] rejecting: zlib:-pkg-config (constraint from command line flag requires
opposite flag selection)
[__3] fail (backjumping, conflict set: zlib, zlib:bundled-c-zlib,
zlib:pkg-config)
After searching the rest of the dependency tree exhaustively, these were the
goals I've had most trouble fulfilling: zlib, zlib-ghc-windows,
zlib:bundled-c-zlib, zlib:pkg-config
Try running with --minimize-conflict-set to improve the error message.
I will file a ghc issue either way.
@fendor I think ultimately it's either Cabal's or GHC's responsibility: if extra-libraries: z is not available or is no good to link with, they should say so loudly instead of producing segfaulting artefacts.
I agree. I am looking into it a little bit.
Tracking this issue in ghc: https://gitlab.haskell.org/ghc/ghc/-/issues/24531
From recollection, MSYS2 does not come with pkg-config.exe by default and you have to manually install https://packages.msys2.org/package/mingw-w64-x86_64-pkgconf. EDIT: Recollection confirmed with a fresh Stack-supplied MSYS2:
❯ stack exec -- where.exe pkg-config
INFO: Could not find files for the given pattern(s).
@andreasabel I think that one is an orthogonal, Stack-specific issue, not quite related to the error here (which is that pkg-config exists and advertises the zlib C library as available, but linking fails eventually).
@Bodigrim, this may be off-topic for this particular issue (EDIT: perhaps on topic for https://github.com/haskell/zlib/issues/64), but why has zlib-0.7 chosen to make the default for its Cabal flag pkg-config true on Windows? If I set the flag to false, zlib-0.7 works fine 'out of the box' on Windows.
The problem I have is: if I have a dependency on zlib (as I do in stack.cabal), and I am using Windows, how do I specify that its pkg-config Cabal flag needs to be set to false? I don't think you can do that with Cabal, and Stack's flags configuration option is not conditional on operating system. Is the only solution to set the pkg-config flag to false for all operating systems (EDIT: that is, using Stack's flags configuration option)?
The problem I have is: if I have a dependency on zlib (as I do in stack.cabal), and I am using Windows, how do I specify that its pkg-config Cabal flag needs to be set to false? I don't think you can do that with Cabal, and Stack's flags configuration option is not conditional on operating system.
pkg-config is an automatic flag and Cabal is happy to solve it depending on environment, so normally there is nothing to specify. Even if it was not automatic, cabal.project supports conditions based on OS.
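For illustration, an OS-conditional cabal.project could look roughly like the fragment below. This is a sketch under assumptions: it presumes cabal-install 3.8 or newer (where project-file conditionals are available), a single package in the current directory, and that forcing the bundled-c-zlib flag on Windows is the workaround you want. It is not taken from the zlib repository.

```cabal
-- cabal.project (hypothetical project layout)
packages: .

if os(windows)
  -- Assumption: build zlib's bundled C sources on Windows instead of
  -- linking against whatever libz the toolchain happens to provide.
  constraints: zlib +bundled-c-zlib
```

On other operating systems the conditional block is skipped and the solver picks the flags automatically.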
As I said in https://github.com/commercialhaskell/stack/issues/6557, Stackage snapshots should set pkg-config to false uniformly, yes.
As noted in https://gitlab.haskell.org/ghc/ghc/-/issues/24531#note_559785, the situation on Windows is a little complicated. GHC always links against <ghc-install-dir>/mingw/lib, as this contains libraries that are needed for GHC's RTS (among other things). However, this library also contains libz.dll.a, an import library that tells GHC to dynamically load the zlib1.dll shared library at runtime. As far as the linker is concerned, the presence of libz.dll.a at link time means that everything is working as expected.
Where things go wrong is when you actually run the executable. Due to how dynamic linking works on Windows, the loader can't know ahead of time where zlib1.dll is (there are no rpaths on Windows), so the loader instead searches your PATH for zlib1.dll. There is a zlib1.dll file located in <ghc-install-dir>/mingw/bin, but most users won't have that on their PATH (and it's unclear if that would be advisable in general). Therefore, the executable will fail at runtime when it can't find zlib1.dll.
Many GHC users also have MinGW-w64 installed (via MSYS2), and when you run something in an MSYS2 shell, it will add a directory to your PATH that contains another copy of zlib1.dll. As such, this issue may not occur for you locally if you are running in MSYS2. If that is the case, try running the same commands in PowerShell (and make sure that you didn't add any MSYS2 directories to your PATH).
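To see which copies of zlib1.dll are visible from a given shell, a small Haskell program can list every match on PATH in search order. This is only a sketch: findOnPath is a hypothetical helper name, it relies on the directory and filepath packages that ship with GHC, and it looks at PATH alone, whereas the Windows loader also checks the executable's own directory and the system directories first.

```haskell
module Main (main) where

import System.Directory (findFiles)
import System.FilePath (getSearchPath)

-- | Hypothetical helper: every copy of the named file found in the
-- directories listed in PATH, in PATH order.
findOnPath :: FilePath -> IO [FilePath]
findOnPath name = do
  dirs <- getSearchPath
  findFiles dirs name

main :: IO ()
main = do
  hits <- findOnPath "zlib1.dll"
  case hits of
    [] -> putStrLn "zlib1.dll is not on PATH"
    _  -> mapM_ putStrLn hits
```

Running it once from PowerShell and once from an MSYS2 shell should show whether the two environments expose different copies.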
Having said all of that, it's unclear to me what can be done about this on the GHC side. I am not a Windows GHC expert, so I presume that there is a good reason for including libz.dll.a in <ghc-install-dir>/mingw/lib, but it does have the unfortunate side effect of messing with .cabal files that depend on extra-libraries: z.
A workaround would be to compile zlib using the bundled-c-zlib or pkg-config flags. I wonder if bundled-c-zlib should be the default on Windows until we figure out how to resolve https://gitlab.haskell.org/ghc/ghc/-/issues/24531.
However, this library also contains libz.dll.a, an import library that tells GHC to dynamically load the zlib1.dll shared library at runtime.
@RyanGlScott is there any way to force static linking? Or is libz.dll.a only dynamically-linkable?
it does have the unfortunate side effect of messing with .cabal files that depend on extra-libraries: z.
Is my understanding correct that we can never trust extra-libraries: z, because we do not know whether it is a static or dynamic library?
is there any way to force static linking?
In principle, yes, although I haven't managed to figure out its quirks. GHC accepts the -l:libXYZ.a syntax, which instructs the linker to link against a specific file. With this, you can tell GHC to link against libz.a (a static archive) instead of defaulting to the libz.dll.a import library (which is what would happen if you passed -lz).
That being said, this appears to be somewhat buggy in practice. I tried modifying zlib.cabal like so:
diff --git a/zlib.cabal b/zlib.cabal
index 24e2595..22aff8b 100644
--- a/zlib.cabal
+++ b/zlib.cabal
@@ -118,7 +118,7 @@ library
pkgconfig-depends: zlib
else
-- On Windows zlib is shipped with GHC starting from 7.10
- extra-libraries: z
+ extra-libraries: :libz.a
test-suite tests
type: exitcode-stdio-1.0
But that fails with a different linker error when building the executable:
Building executable 'example' for zlib-ghc-windows-0.1..
[2 of 2] Linking C:\\Users\\winferno\\Documents\\Hacking\\Haskell\\zlib-ghc-windows-65\\dist-newstyle\\build\\x86_64-windows\\ghc-9.4.8\\zlib-ghc-windows-0.1\\x\\example\\build\\example\\example.exe
ld.lld: warning: ignoring unknown argument: -exclude-symbols:zcalloc
ld.lld: warning: ignoring unknown argument: -exclude-symbols:zcfree
ld.lld: error: -exclude-symbols:zcalloc is not allowed in .drectve
ld.lld: error: -exclude-symbols:zcfree is not allowed in .drectve
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_init
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_stored_block
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_flush_bits
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_align
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_flush_block
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_tr_tally
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_dist_code
ld.lld: warning: ignoring unknown argument: -exclude-symbols:_length_code
ld.lld: error: -exclude-symbols:_tr_init is not allowed in .drectve
ld.lld: error: -exclude-symbols:_tr_stored_block is not allowed in .drectve
ld.lld: error: -exclude-symbols:_tr_flush_bits is not allowed in .drectve
ld.lld: error: -exclude-symbols:_tr_align is not allowed in .drectve
ld.lld: error: -exclude-symbols:_tr_flush_block is not allowed in .drectve
ld.lld: error: -exclude-symbols:_tr_tally is not allowed in .drectve
ld.lld: error: -exclude-symbols:_dist_code is not allowed in .drectve
ld.lld: error: -exclude-symbols:_length_code is not allowed in .drectve
ld.lld: warning: ignoring unknown argument: -exclude-symbols:inflate_table
ld.lld: error: -exclude-symbols:inflate_table is not allowed in .drectve
ld.lld: warning: ignoring unknown argument: -exclude-symbols:inflate_fast
ld.lld: error: -exclude-symbols:inflate_fast is not allowed in .drectve
clang: error: linker command failed with exit code 1 (use -v to see invocation)
ghc-9.4.8.exe: `clang.exe' failed in phase `Linker'. (Exit code: 1)
Is my understanding correct that we can never trust extra-libraries: z, because we do not know whether it is a static or dynamic library?
The issue isn't really static vs. dynamic libraries, but rather dynamic libraries that are on your runtime search path (e.g., MinGW-w64 libraries) versus ones that aren't (e.g., libraries that are bundled with GHC). Using a dynamically linked libz is perfectly fine provided that the dynamic loader knows where it is at runtime, and this is precisely why the pkg-config option works most of the time.
On Windows, in the Stack environment, I think the GHC-supplied zlib1.dll is always on the PATH (and first on the PATH). For example, on my system:
❯ stack --snapshot ghc-9.6.5 exec -- where.exe zlib*
C:\Users\mike\AppData\Local\Programs\stack\x86_64-windows\ghc-9.6.5\mingw\bin\zlib1.dll
C:\Program Files\gnuplot\bin\zlib1.dll
C:\Program Files (x86)\gnupg\bin\zlib1.dll
C:\Program Files\Inkscape\bin\zlib1.dll
As indicated above, a number of applications that I use put a copy of zlib1.dll on the PATH. In the past, outside of the Stack environment, I have had problems with Haskell code picking up an out-of-date version of zlib1.dll on the PATH (fixed by replacing it with an up-to-date version).
Thanks for the investigation @RyanGlScott!
To observe the issue, do the following on Windows (having installed GHC 9.2.8+).
Have an example.cabal with the following contents:
Have a file Main.hs with the following contents:
Run cabal build, then cabal exec example. For some reason, Test is not printed to stdout. Running echo $lastexitcode shows that the exit code given is -1073741701, which from a quick google is typically related to incorrect linkings. Note that this is a runtime failure, not a build failure.
Changing the zlib version to 0.6.3.0 (which is the previous version) means that this program works.
This is probably related to "Do not force bundled-c-zlib on Windows, but force it for WASM." in the previous release, if I had to guess.
This error arose when similar code was written using a library massively downstream of zlib (discord-haskell, with code as below). This is even more surprising, since I'm pretty sure that restCall shouldn't directly reference compress or similar values.
One other note is that my Windows Haskell setup is entirely fresh and made specifically to test this out, so it's unlikely to be an issue with my machine (also considering that someone else brought this issue to me).
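As an aside on the exit code quoted above: PowerShell prints it as a signed 32-bit integer, while Windows status codes are usually looked up in unsigned hexadecimal. The sketch below does that conversion; ntStatusHex is a hypothetical helper name, not part of any library. The value -1073741701 comes out as 0xc000007b, which Windows lists as STATUS_INVALID_IMAGE_FORMAT. That is offered only as a lookup aid, not as a diagnosis of this issue.

```haskell
module Main (main) where

import Data.Int (Int32)
import Data.Word (Word32)
import Numeric (showHex)

-- | Hypothetical helper: reinterpret a signed 32-bit process exit code
-- as an unsigned value and render it in hexadecimal.
ntStatusHex :: Int32 -> String
ntStatusHex code = "0x" ++ showHex (fromIntegral code :: Word32) ""

main :: IO ()
main = putStrLn (ntStatusHex (-1073741701))  -- prints 0xc000007b
```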