Closed nh2 closed 5 months ago
stack test -v
output leading to the local file changes: README.md ...
:
[debug] Run process: /sbin/ldconfig -p
[debug] Process finished in 2ms: /sbin/ldconfig -p
[debug] Found shared library libtinfo.so.5 in 'ldconfig -p' output
[debug] Did not find shared library libtinfo.so.6
[debug] Did not find shared library libncursesw.so.6
[debug] Found shared library libgmp.so.10 in 'ldconfig -p' output
[debug] Did not find shared library libgmp.so.3
[debug] Potential GHC builds: standard
[debug] Found already installed GHC builds: standard
[debug] SELECT "id","actual_version","arch","ghc_path","ghc_size","ghc_modified","ghc_pkg_path","runghc_path","haddock_path","cabal_version","global_db","global_db_cache_size","global_db_cache_modified","info","global_dump" FROM "compiler_cache" WHERE "ghc_path"=?; [PersistText "/raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-8.6.5"]
[debug] Loaded compiler information from cache
[debug] Asking for a supported GHC version
[debug] Resolving package entries
[debug] Parsing the targets
[debug] Checking flags
[debug] SourceMap constructed
[debug] Starting to execute command inside EnvConfig
[debug] Finding out which packages are already installed
[debug] Run process: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --global --no-user-package-db dump --expand-pkgroot
[debug] Process finished in 27ms: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --global --no-user-package-db dump --expand-pkgroot
[debug] Ignoring package Cabal due to wanting version 2.4.1.0 instead of 2.4.0.1
[debug] Ignoring package haskeline due to wanting version 0.7.5.0 instead of 0.7.4.3
[debug] Run process: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --user --no-user-package-db --package-db /raid/stack/snapshots/x86_64-linux/37319c6575a7d854bf30c6a8fcfdf1546086e900241daeda1f669d54761755ae/8.6.5/pkgdb dump --expand-pkgroot
[debug] Process finished in 81ms: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --user --no-user-package-db --package-db /raid/stack/snapshots/x86_64-linux/37319c6575a7d854bf30c6a8fcfdf1546086e900241daeda1f669d54761755ae/8.6.5/pkgdb dump --expand-pkgroot
[debug] Run process: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --user --no-user-package-db --package-db /home/niklas/src/haskell/lz4-frame-conduit/.stack-work/install/x86_64-linux/37319c6575a7d854bf30c6a8fcfdf1546086e900241daeda1f669d54761755ae/8.6.5/pkgdb dump --expand-pkgroot
[debug] Process finished in 19ms: /raid/stack/programs/x86_64-linux/ghc-8.6.5/bin/ghc-pkg-8.6.5 --user --no-user-package-db --package-db /home/niklas/src/haskell/lz4-frame-conduit/.stack-work/install/x86_64-linux/37319c6575a7d854bf30c6a8fcfdf1546086e900241daeda1f669d54761755ae/8.6.5/pkgdb dump --expand-pkgroot
[debug] Constructing the build plan
[debug] SELECT "id","directory","type","pkg_src","active","path_env_var","haddock" FROM "config_cache" WHERE "directory"=? AND "type"=?; [PersistText "/home/niklas/src/haskell/lz4-frame-conduit/.stack-work/install/x86_64-linux/37319c6575a7d854bf30c6a8fcfdf1546086e900241daeda1f669d54761755ae/8.6.5/",PersistText "lib:lz4-frame-conduit-0.1.0.0-8RAA6BcdNm21R7aXlL5jl8"]
[debug] SELECT "id", "config_cache_id", "index", "option" FROM "config_cache_dir_option" WHERE ("config_cache_id"=?) ORDER BY "index"; [PersistInt64 2035]
[debug] SELECT "id", "config_cache_id", "index", "option" FROM "config_cache_no_dir_option" WHERE ("config_cache_id"=?) ORDER BY "index"; [PersistInt64 2035]
[debug] SELECT "id", "config_cache_id", "ghc_pkg_id" FROM "config_cache_dep" WHERE ("config_cache_id"=?); [PersistInt64 2035]
[debug] SELECT "id", "config_cache_id", "component" FROM "config_cache_component" WHERE ("config_cache_id"=?); [PersistInt64 2035]
[debug] Start: getPackageFiles /home/niklas/src/haskell/lz4-frame-conduit/lz4-frame-conduit.cabal
[debug] Finished in 2ms: getPackageFiles /home/niklas/src/haskell/lz4-frame-conduit/lz4-frame-conduit.cabal
[debug] Checking if we are going to build multiple executables with the same name
[debug] Executing the build plan
[info] lz4-frame-conduit-0.1.0.0: unregistering (local file changes: README.md app/Main.hs lz4-frame-conduit.cabal lz4/lib/lz4.c lz4/lib/lz4.h lz4/lib/lz4frame.c lz4/...)
I'll try to repro, but as a first hunch, I guess it's a build cache.
If you look at the .stack-work
folder in your project after building it (e.g. by using stack test
), there are a lot of files that result from past compilations, sorted by Cabal version in the dist
folder and GHC version in the install
folder.
Running stack clean
wipes out the dist
folder. The install
folder remains with its wacky hashes.
This is the output I get from building one of my pet projects:
Linking .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/flattendir-test/flattendir-test ...
Preprocessing executable 'flattendir-exe' for flattendir-0.1.0.0..
Building executable 'flattendir-exe' for flattendir-0.1.0.0..
[1 of 2] Compiling Main
[2 of 2] Compiling Paths_flattendir
Linking .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/flattendir-exe/flattendir-exe ...
flattendir> copy/register
Installing library in [...]/flattendir/.stack-work/install/x86_64-linux/9570ddfa5c53856cf854aff6bc38a117a2b2ccd9c89d93289e368b54d0c78fa9/8.6.5/lib/x86_64-linux-ghc-8.6.5/flattendir-0.1.0.0-HGHxHxXKkpZLfvbfb4LpUM
Installing executable flattendir-exe in [...]/flattendir/.stack-work/install/x86_64-linux/9570ddfa5c53856cf854aff6bc38a117a2b2ccd9c89d93289e368b54d0c78fa9/8.6.5/bin
Registering library for flattendir-0.1.0.0..
The last lines indicate the compilation results are kept in the install
folder, followed by the arch, followed by a hash of some sort, followed by the GHC version, followed by other stuff. This way, it may keep build results generated through both resolvers. Apparently, stack
is reusing the cached stuff instead of building anew, which is desirable.
Now, the issue with your tests is interesting. It should not cache a failed build... maybe? Assuming that's the issue. I'll try to check what's going on.
The thing that the project doesn't get rebuilt is explained by the fact that 2 LTSes give different implicit snapshots and no config change make it possible for them to coexist even for project packages. But probably it's not something intuitive as e.g. switch between a normal build and a test build invalidates project packages. The one thing which clearly needs an investigation is this thing:
# lz4-frame-conduit-0.1.0.0: unregistering (local file changes: README.md app/Main.hs lz4-frame-conduit.cabal lz4/lib/lz4.c lz4/lib/lz4.h lz4/lib/lz4frame.c lz4/...)
# Why is that? None of these files changed.
I started looking into that but had no time yet.
I cloned the repo. Noticed it already had lts-13.26
as resolver.
After running stack test
and switching back and forth between the resolvers, I noticed no difference between the test outputs. I couldn't observe the GHC panic/segfault. I didn't see the unregistering
thing in any of the runs. The tests run with both resolvers with pretty much the same output.
The project relies on C stuff and a *.cabal
file instead of a package.yaml
.
I'm using Ubuntu 18.04.4, stack
2.3.3, and gcc
7.5.0 if that's of some use.
Minor finding: when switching resolvers, the test compresses 1MB ByteString
fails with a hClose: resource vanished (Broken pipe)
message, but when I ran the test again without switching resolvers, the test fails with an expectation mismatch between a large string of *
and an empty string.
@ardamose123 Thanks for letting us know! That's interesting that it didn't unregister
for you.
(On the point of the project's crash, which is unrelated of the stack issue: Yes, resource vanished
is one of the ways that the underlying memory corruption can manifest -- I learned by now that it doesn't necessarily always segfault, see https://gitlab.haskell.org/ghc/ghc/-/issues/18680 for details on that.)
Correction: I just saw the unregistering
line by running stack test -v
. I was not running tests with the -v
flag before. Only happens when switching resolvers. This line doesn't appear if you run the tests again without changing resolvers.
However, as I mentioned before, I can build the project alright and run the tests while switching between resolvers, so it might not be related to the original issue.
@nh2, can you please describe your development environment w/ versions?
I just saw the
unregistering
line by runningstack test -v
@ardamose123 Do you also see local file changes: README.md app/Main.hs lz4-frame-conduit.cabal lz4/lib/lz4.c lz4/lib/lz4.h lz4/lib/lz4frame.c lz4/...
in that case?
Also, when you switch resolvers, does it unregister
each time, or only the first time (as it does for me)?
(For the purpose of this bug, I don't worry much about the test output, as that's either a bug in my library's code or a GHC bug. I do worry about stack's rebuilding/unregistering behaviour though.)
Only first time, just after switching resolvers.
Only first time, just after switching resolvers.
just for clarity, I mean when you switch resolvers forward and backward repeatedly.
Just ran the tests a few more times. I only noticed the unregister
after using stack clean
:
stack clean
resolver: lts-13.26
stack test -v
» unregister
appears.stack test -v
» logs are normal.resolver: lts-14.27
stack test -v
» unregister
appears.stack test -v
» logs are normal.resolver: lts-13.26
stack test -v
» logs are normal.stack test -v
» logs are normal.resolver: lts-14.27
stack test -v
» logs are normal.stack test -v
» logs are normal.stack clean
stack test -v
» unregister
appears.I thought I was getting the unregistering
on every resolver switch, but I probably did a a stack clean
in between my tests earlier.
This behavior LGTM, but I don't know if @nh2 is getting some other behavior.
@nh2 I think I see the reason behind local file changes: README.md app/Main.hs...
- after stack clean
there is no build cache yet and thus Stack reports every package file as dirty, I think we just need to handle this case better and produce some more meaningful message. With that fixed I think there is nothing more to explore in this ticket. Let me know if you see something else unresolved.
@qrilka The main problem was this:
# I switch resolvers back, lts-14.27 -> lts-13.26 stack test # Does not rebuild! No further resolver switching rebuilds.
where the expectation was that it should rebuild because the dependent libraries change.
I don't think we've found a solution/explanation for that yet (or did I miss it?).
@ardamose123 When you say "logs are normal", does it rebuild + relink the executable after switching resolvers?
@nh2 see my answer in https://github.com/commercialhaskell/stack/issues/5381#issuecomment-692904536 - if you don't do stack clean
you'll have 2 versions of your package built for 2 different implicit packages (one for each LTS).
@qrilka Ah sorry, I missed that. I don't quite follow it though. I also don't quite understand @ardamose123's preceding comment about
results are kept in the
install
folder
because I'm doing nothing with install/
, I'm using the contents of the build/
directory.
I still do observe that switching resolvers leaves outdated contents in the build
directory:
Summarised output of the above:
% stack test
Linking .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test ...
2 examples, 0 failures
% sha1sum .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
9d23a8dd39ccfe86c164d52c03d59f103a904eec .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
% # switch to lts-13.27
% stack test
Linking .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test ...
2 examples, 0 failures
% sha1sum .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
c0d644d1d1f9d90e241ba5191bbbf2ca15c3ce67 .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
% # switch to lts-13.26
% stack test
2 examples, 0 failures
% sha1sum .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
c0d644d1d1f9d90e241ba5191bbbf2ca15c3ce67 .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
As shown, I switch resolvers, and then the binary at .stack-work/dist/x86_64-linux/Cabal-2.4.0.1/build/lz4-frame-conduit-test/lz4-frame-conduit-test
is not updated by stack test
(it's the same binary as from before I switched resolvers).
Ouch, I didn't take into account that test executables don't go into snapshot-specific directory... I guess this is just another case of ghc-pkg(?) treating executables and packages differently and Stack got it wrong again... And maybe component-based builds could help here as well? :) Stack sees your library in .stack-work/dist/x86_64-linux/XXX and thus it assumes that nothing needs to be rebuilt. Ok, I will take a look into some way out of this (hopefully without going into component-based builds).
After trying to fix this I came to conclusion that this doesn't look to be easily resolvable with the way stack builds work and the way to fix it properly is to do proper component-based builds as in #4745 The problem with an attempt to fix this seem to break the current way Stack excludes some rebuilds.
I am going to close this issue in favour of #6356, as it appears to be an instance of component-based builds.
When I switch between
lts-14.27
andlts-13.26
repeatedly, stack rebuilds my code only on the first such switch.This is illegitmate because my executable depends on (at least)
conduit-extra
, and that one's version switches between those LTSs.Background / how I found this
I'm debugging an issue with my library:
The project for reproing (on Ubuntu 18.04): https://github.com/nh2/lz4-frame-conduit/tree/4616ec594426de2defda6432eec9cff04692bf7c
Command to run:
When I test after
stack clean
, my bug (a segfault or GHC panic) occurs withand not with
The actual Stack issue
But when I switch between the 2 and run stack test without previous stack clean, most of the time stack doesn't rebuild the code, so then I cannot observe the difference when switching resolvers.
I get this strange behaviour:
So it rebuilds only the first time I switch resolvers (but for apparently the wrong reason), and then never again.