haskell / actions

GitHub Actions for Haskell CI

Caching dist-newstyle still rebuilds project tests every time #41

Closed raehik closed 3 years ago

raehik commented 3 years ago

I'm trying to cache ~/.cabal/store and dist-newstyle for my Cabal builds using this action. With that, I expect that if I make a non-code change, Cabal shouldn't have to rebuild anything. But no matter what I try, Cabal will always rebuild the project tests.

One message in the log sticks in my mind:

./.cabal has been changed. Re-configuring with most recently used options. If this fails, please run configure manually.

But I don't know what this means: the *.cabal file isn't edited at all on CI. Also, the library build succeeds just fine; that message only shows up for the tests.

My tests are at raehik/reprinter (example problematic run workflow). Larger project building at raehik/fortran-src (example problematic run log).

Similar commands run locally don't have this behaviour. I managed to sort most of my issues (Cabal will rebuild the project too if unrelated flags aren't kept the same), but I'd like to know why this occurs. Hope this is relevant, I realise it's a bit of a StackOverflow question -- let me know if I should take it elsewhere.

tfausak commented 3 years ago

It's likely that the rebuild is caused by GHC, not Cabal. The way that GHC determines if it needs to rebuild a file is to compare the source file's timestamp to the compiled file's timestamp. If the source file is newer, it rebuilds. Typically with local workflows this works fine. Unfortunately on CI the first thing that happens is usually cloning the repository, which sets all the source file timestamps to "now". Since all the source files are apparently newer, everything gets rebuilt.

You could solve this problem by manually tracking the timestamps for all your source files. That should work even though it will be a little tedious.

There's an upstream issue against GHC for this behavior: https://gitlab.haskell.org/ghc/ghc/-/issues/16495
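The timestamp comparison described above can be illustrated with a small shell sketch. It is a simplification and not GHC's actual recompilation check; the function name needs_rebuild and the file names M.hs and M.o are hypothetical, chosen only for the demonstration.

```shell
#!/usr/bin/env bash
# needs_rebuild SRC OBJ: succeed when OBJ is missing or SRC has a newer mtime.
# (Hypothetical helper; it only mimics a timestamp-based check.)
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

demo_dir=$(mktemp -d)
touch -t 202001010000.00 "$demo_dir/M.hs"   # source last edited in 2020
touch -t 202101010000.00 "$demo_dir/M.o"    # object built in 2021, restored from a cache

if needs_rebuild "$demo_dir/M.hs" "$demo_dir/M.o"; then before=rebuild; else before=skip; fi

# A fresh clone writes the source file again, so its mtime becomes "now".
touch "$demo_dir/M.hs"
if needs_rebuild "$demo_dir/M.hs" "$demo_dir/M.o"; then after=rebuild; else after=skip; fi

echo "before checkout: $before, after checkout: $after"
rm -rf "$demo_dir"
```

The cached object file is unchanged in both cases; only the source file's mtime differs, which is enough to flip the decision.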

raehik commented 3 years ago

Thanks very much @tfausak ! I was especially confused because Stack appeared to work fine/as expected. That issue explains everything, I see you worked on solving it last year.

And your suggestion works perfectly. Putting this snippet at the start of a job:

- uses: actions/checkout@v2
  with:
    fetch-depth: 0
- name: Set all tracked file modification times to the time of their last commit
  run: |
    rev=HEAD
    for f in $(git ls-tree -r -t --full-name --name-only "$rev") ; do
        touch -d $(git log --pretty=format:%cI -1 "$rev" -- "$f") "$f";
    done

appears to get me the caching behaviour I was hoping for. Thanks again for your help. If you think any of this is useful/common enough to add as a note in this repo, I'd gladly do that. I'll close this issue for now.

andreasabel commented 2 years ago

This code snippet isn't portable: macOS touch does not accept the -d option.

tfausak commented 2 years ago

Works fine for me on macOS Monterey:

$ uname -a
Darwin TayMini.local 21.4.0 Darwin Kernel Version 21.4.0: Fri Mar 18 00:47:26 PDT 2022; root:xnu-8020.101.4~15/RELEASE_ARM64_T8101 arm64

$ touch some-file

$ ls -l
total 0
-rw-r--r--  1 taylor  staff  0 May  7 07:18 some-file

$ touch -d 2001-02-03T04:05:06 some-file

$ ls -l
total 0
-rw-r--r--  1 taylor  staff  0 Feb  3  2001 some-file

andreasabel commented 2 years ago

That may be, but it fails on GHA with macOS-latest: https://github.com/agda/agda/runs/6333826205?check_suite_focus=true Here is a portable version:

      for f in $(git ls-tree -r -t --full-name --name-only HEAD) ; do
          touch -t $(git log -1 --pretty=format:%cd --date=format:%Y%m%d%H%M.%S HEAD -- "$f") "$f";
      done

See it in action here: https://github.com/agda/agda/runs/6333962780?check_suite_focus=true
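A variant of the portable loop that also copes with paths containing spaces might look like the sketch below. The function name restore_mtimes is hypothetical. It assumes bash, a full-depth clone, and a git recent enough to support --date=format-local, which prints the date in the local time zone (the same zone touch -t uses to interpret its argument).

```shell
#!/usr/bin/env bash
# restore_mtimes: set each tracked file's mtime to the committer date of the
# last commit that touched it. Can be called from anywhere inside the clone.
# (Hypothetical helper name; a sketch, not part of haskell/actions.)
restore_mtimes() (
    cd "$(git rev-parse --show-toplevel)" || exit 1
    # ls-files lists files only (no tree entries); -z keeps odd paths intact.
    git ls-files -z | while IFS= read -r -d '' f; do
        stamp=$(git log -1 --pretty=format:%cd \
                    --date=format-local:%Y%m%d%H%M.%S HEAD -- "$f")
        # Skip anything without history or missing from the working tree.
        if [ -n "$stamp" ] && [ -e "$f" ]; then
            touch -t "$stamp" "$f"
        fi
    done
)
```

In a workflow step this would be defined (or sourced) and then called as restore_mtimes after the checkout. It still runs one git log per file, so it does not change the cost of the loop.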

A fly in the ointment is that both steps

  1. full-depth checkout
  2. touch mtime

take considerable time themselves, which makes the trick less worthwhile. E.g. on macos-latest:

  1. 1m 54s (20000 commits)
  2. 2m 9s (480 files)

even though I restricted step 2. to only the ~480 files in src/full (rather than the whole 8500 files in the repo). The other OSs behave better: Ubuntu (50-60s for 1.+2.), Windows (60s + 30s).

So, the trick has to be taken with a grain of salt...
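If part of the cost of step 2 comes from launching one git log per file (an assumption, not something measured in this thread), a single pass over the history might be cheaper. The sketch below walks the log once, newest commit first, and uses the first timestamp seen for each path. The function name restore_mtimes_fast is hypothetical. It still needs the full-depth checkout, assumes bash and awk, and does not handle paths that contain tabs or newlines or that start with "@".

```shell
#!/usr/bin/env bash
# restore_mtimes_fast: one history walk instead of one `git log` per file.
# (Hypothetical helper name; a sketch under the assumptions stated above.)
restore_mtimes_fast() (
    cd "$(git rev-parse --show-toplevel)" || exit 1
    # Stage 1: every commit prints "@<stamp>" followed by the paths it changed.
    # Stage 2: awk keeps only the first (newest) stamp seen for each path.
    # Stage 3: apply the stamp to paths that still exist in the working tree.
    git -c core.quotepath=off log --format='@%cd' \
        --date=format-local:%Y%m%d%H%M.%S --name-only HEAD |
    awk '/^@/ { stamp = substr($0, 2); next }
         NF && !seen[$0]++ { print stamp "\t" $0 }' |
    while IFS=$'\t' read -r stamp path; do
        if [ -f "$path" ]; then
            touch -t "$stamp" "$path"
        fi
    done
)
```

Files that were only ever changed in merge commits are not listed by git log --name-only by default, so they keep their checkout time and would still be rebuilt.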

raehik commented 2 years ago

I have this commented out on my Mac workflows, same problem @andreasabel had with macos touch having different flags. Also, I don't think I recognized the meaning of fetch-depth: 0 when I wrote it, so thank you. I saw a GHC PR Big driver refactor (!5661) that included hash-based recompilation work get merged a while back. But I don't know if it was released in 9.2, or if it might be in 9.4, or if it extends to Cabal workflows.

andreasabel commented 2 years ago

> I have this commented out on my Mac workflows, same problem @andreasabel had with macos touch having different flags.

Yes, but the -t option seems portable, it worked for me on all three platforms.

> I saw a GHC PR Big driver refactor (!5661) that included hash-based recompilation work get merged a while back. But I don't know if it was released in 9.2, or if it might be in 9.4, or if it extends to Cabal workflows.

Indeed, ideally this is fixed upstream. But I think this hasn't been released with 9.2.

hdgarrood commented 2 years ago

This is indeed fixed in GHC 9.4! (But not in 9.2, as you noted.)

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.4.2
harry@sunbird.local: ~/Code/ghc-scratch 
$ ghc --make M3.hs
[1 of 3] Compiling M1               ( M1.hs, M1.o )
[2 of 3] Compiling M2               ( M2.hs, M2.o )
[3 of 3] Compiling M3               ( M3.hs, M3.o )
harry@sunbird.local: ~/Code/ghc-scratch 
$ touch M3.hs
harry@sunbird.local: ~/Code/ghc-scratch 
$ ghc --make M3.hs
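
The transcript above touches M3.hs without changing its contents. The idea behind a hash-based check, as opposed to a timestamp-based one, can be sketched in shell as follows. This only illustrates the principle and is not GHC's implementation; the helper names record_hash and source_changed and the M3.hash file are hypothetical.

```shell
#!/usr/bin/env bash
# record_hash SRC STAMP: remember a checksum of SRC's current contents.
record_hash() { cksum < "$1" > "$2"; }

# source_changed SRC STAMP: succeed when SRC's contents differ from the
# recorded checksum, or when nothing has been recorded yet.
source_changed() {
    [ ! -e "$2" ] || [ "$(cksum < "$1")" != "$(cat "$2")" ]
}

work=$(mktemp -d)
printf 'module M3 where\n' > "$work/M3.hs"
record_hash "$work/M3.hs" "$work/M3.hash"

# New mtime, same bytes: a content check sees no change.
touch "$work/M3.hs"
if source_changed "$work/M3.hs" "$work/M3.hash"; then after_touch=rebuild; else after_touch=skip; fi

# A real edit changes the checksum.
printf 'x = 1\n' >> "$work/M3.hs"
if source_changed "$work/M3.hs" "$work/M3.hash"; then after_edit=rebuild; else after_edit=skip; fi

echo "after touch: $after_touch, after edit: $after_edit"
rm -rf "$work"
```

With a check of this kind, a fresh clone that only resets modification times would not by itself trigger recompilation.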