commercialhaskell / stack

The Haskell Tool Stack
http://haskellstack.org
BSD 3-Clause "New" or "Revised" License

Unnecessary rebuilds when stack detects Paths_*.hs file changes #5503

Closed. Opened by brandon-leapyear; closed 7 months ago.

brandon-leapyear commented 3 years ago

General summary/comments (optional)

We have a large monorepo with multiple Haskell projects, and we've noticed that Stack recompiles much more often than it needs to. For example, we might run a stack build, which builds A -> B -> C, but then running stack build immediately afterwards would unregister A, forcing the recompilation of B and C. I noticed that the unregistering output mentions the Paths_packageA.hs file changing, which is odd because the stack build commands run immediately after each other.

After doing some digging, I noticed that if Paths_packageA.hs changes during stack build (i.e. gets regenerated), the next stack build compares the new version of Paths_packageA.hs with the version of Paths_packageA.hs before the previous stack build call, not the version of it after.

Steps to reproduce

  1. Clone your favorite Haskell project, or check out one of your own Haskell projects
    • To ensure it's completely fresh, you can run stack purge or git clean -dfx
  2. stack build
  3. For posterity's sake, run cat on .stack-work/dist/*/*/build/*/autogen/Paths_*.hs
  4. Do something to change the snapshot hash
    • e.g. edit extra-deps in stack.yaml
  5. stack build -- this rightfully rebuilds
  6. Again, run cat on the Paths file -- should have been regenerated
  7. stack build

Expected

The last stack build command should not rebuild

Actual

The last stack build command rebuilds, with

unregistering (local file changes: .stack-work/dist/x86_64-osx/Cabal-3.0.1.0/build/*/autogen/Paths_*.hs)

The rebuild might be fast if you're in a repo with only one Haskell package, but in a repo with multiple Haskell packages depending on each other, this would completely recompile everything downstream from the unregistered package.

EDIT: I guess it doesn't recompile everything downstream, but it unregisters everything and forces everything downstream to rerun the configure + copy/register phase, which is still less than ideal for our system of ~20 packages.
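
To make "everything downstream" concrete, here is a minimal Python sketch that computes which packages transitively depend on an unregistered one. It is not Stack's implementation. The `deps` map and the package names are hypothetical, modelled on the A -> B -> C example from the summary under the assumption that B depends on A and C depends on B.

```python
from collections import deque

def downstream(deps, changed):
    """Return every package that transitively depends on `changed`, sorted."""
    # Invert the "depends on" map into an "is depended on by" map.
    rdeps = {}
    for pkg, needs in deps.items():
        for dep in needs:
            rdeps.setdefault(dep, set()).add(pkg)
    # Breadth-first walk over reverse dependencies.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in rdeps.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)

# Hypothetical monorepo: B depends on A, C depends on B, D is unrelated.
deps = {"A": [], "B": ["A"], "C": ["B"], "D": []}
print(downstream(deps, "A"))
```

In this toy graph, unregistering A drags B and C along while D is left alone, which is why the cost grows with the depth of the dependency chain.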

Stack version

$ stack --version
Version 2.5.1, Git revision d6ab861544918185236cf826cb2028abb266d6d5 x86_64 hpack-0.33.0

Method of installation

brandon-leapyear commented 3 years ago

:sparkles: This is an old work account. Please reference @brandonchinn178 for all future communication :sparkles:


Just doing a cursory investigation, it seems like the build cache is saved before the build is run? https://github.com/commercialhaskell/stack/blob/30224e347624f34c9ebbbb444c07339cd123eb4d/src/Stack/Build/Execute.hs#L1573-L1574

Why not do this at the end of realBuild or maybe after/in postBuildCheck?
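
The ordering question can be illustrated with a toy model. The Python sketch below is not Stack's code: the `Package` class, the `save_cache_after_build` flag, and the file contents are all hypothetical, and it simply assumes the hypothesis above (the cache records the generated file before the build regenerates it) is correct.

```python
import hashlib

def digest(text):
    """Content hash used by the toy build cache."""
    return hashlib.sha256(text.encode()).hexdigest()

class Package:
    def __init__(self):
        self.paths_source = ""     # contents of the generated Paths_*.hs
        self.cached_digest = None  # what the build cache remembers

    def is_dirty(self):
        """True if the file on disk differs from what the cache recorded."""
        return self.cached_digest != digest(self.paths_source)

    def build(self, snapshot_hash, save_cache_after_build):
        if not save_cache_after_build:
            # Suspected behaviour: the cache records the pre-build file.
            self.cached_digest = digest(self.paths_source)
        # The build regenerates the Paths module for the current snapshot.
        self.paths_source = f"-- paths for {snapshot_hash}"
        if save_cache_after_build:
            # Proposed behaviour: the cache records the post-build file.
            self.cached_digest = digest(self.paths_source)

def third_build_would_rebuild(save_cache_after_build):
    pkg = Package()
    pkg.build("snapshot-1", save_cache_after_build)  # initial build
    pkg.build("snapshot-2", save_cache_after_build)  # legitimate rebuild
    return pkg.is_dirty()                            # state seen by the next build

print(third_build_would_rebuild(save_cache_after_build=False))
print(third_build_would_rebuild(save_cache_after_build=True))
```

Under these assumptions, saving the cache before the build leaves the package looking dirty on the next run, while saving it afterwards does not.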

wraithm commented 3 years ago

I've also experienced this!

Here's a work-around for now. Put this on all of the library, executable, test, and benchmark declarations in your package.yaml files.

    when:
      - condition: false
        other-modules: Paths_<package_name>

You need to replace <package_name> with the name of your package, but any dashes must be replaced with underscores. So if your package is named my-package, you'd write: other-modules: Paths_my_package.
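
With many build targets, the naming rule is easy to get wrong by hand. The small Python helper below is only a sketch of the rule described above; the function names `paths_module` and `workaround_snippet` are made up for illustration, and the generated text mirrors the snippet in this comment.

```python
def paths_module(package_name):
    """Return the Paths module name: dashes become underscores."""
    return "Paths_" + package_name.replace("-", "_")

def workaround_snippet(package_name):
    """Render the workaround stanza for one package."""
    return "\n".join([
        "when:",
        "  - condition: false",
        f"    other-modules: {paths_module(package_name)}",
    ])

print(workaround_snippet("my-package"))
```

The printed stanza still has to be pasted under each component by hand; the helper only removes the risk of a mistyped module name.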

awpr commented 3 years ago

Since there's not much activity on this bug, I'll mention that I'm also seeing this happen in practice: the second build of a given configuration always rebuilds from scratch. It happened in this GitHub Actions workflow, for example, with the "Test" step rebuilding unnecessarily after the "Build and Haddock" step built an identical configuration: https://github.com/google/hs-portray/runs/3644246045

awpr commented 3 years ago

A slightly easier repro than the one in the original bug: since adding --test changes the hash, we can use that to trigger the bug:

% stack clean
% stack build # builds from scratch as expected
% stack test # rebuilds as expected because tests were added
% stack test # rebuilds unnecessarily because the last 'stack test' updated Paths_xyz.hs
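
To spot this in captured build output (a CI log, for example), one option is to scan for the "unregistering" message quoted in the original report. The Python sketch below assumes the message has exactly that shape and that multiple changed files, if any, are separated by whitespace; both are assumptions, not documented Stack behaviour, and the function name is hypothetical.

```python
import re

# Pattern taken from the message quoted in the original report.
UNREGISTER_RE = re.compile(r"unregistering \(local file changes: (.*?)\)")

def paths_triggered_rebuilds(log_text):
    """Return changed files named Paths_*.hs found in 'unregistering' lines."""
    hits = []
    for match in UNREGISTER_RE.finditer(log_text):
        for path in match.group(1).split():
            name = path.rsplit("/", 1)[-1]  # last path component
            if name.startswith("Paths_") and name.endswith(".hs"):
                hits.append(path)
    return hits

sample = (
    "unregistering (local file changes: "
    ".stack-work/dist/x86_64-osx/Cabal-3.0.1.0/build/*/autogen/Paths_*.hs)"
)
print(paths_triggered_rebuilds(sample))
```

A non-empty result on the second of two identical `stack test` runs would suggest the rebuild was triggered by the generated Paths module rather than by a real source change.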

dten commented 3 years ago

Hey we have this too, but only since upgrading from 2.3.1 to 2.7.1.

I will try to bisect the stack repo to find out where this appears

juhp commented 3 years ago

I suspect this also affects stackage builds (very badly)

aryairani commented 2 years ago

This (and maybe other seemingly unnecessary full rebuilds) have been hurting us a lot. We recently estimated 30 seconds of build time per character changed in actual source code.

I'm trying @wraithm's workaround (thank you!) although I hate it because we have a lot of build targets.

andreasabel commented 2 years ago

@dten wrote:

Hey we have this too, but only since upgrading from 2.3.1 to 2.7.1.

I will try to bisect the stack repo to find out where this appears

What did bisection find out?

dten commented 2 years ago

Ah I never managed, sorry

mpilgrem commented 7 months ago

I am closing this issue given the passage of time and because I could not reproduce the issue with Stack 2.15.5, following https://github.com/commercialhaskell/stack/issues/5503#issuecomment-922525225:

> stack new test5503
> cd test5503
> stack build # builds (lib + exe), including Paths_test5503
> stack test # builds (lib + exe + test), including Paths_test5503; runs test
> stack test # does not rebuild; runs test