llvm / circt

Circuit IR Compilers and Tools
https://circt.org

[CI] ccache doesn't work on nightly integration tests #7171

Closed uenoku closed 2 weeks ago

uenoku commented 2 weeks ago

~It looks like the cache key contains a date, so it will never hit. The nightly integration tests seem to use a unified build, so maybe this is inevitable?~ https://github.com/llvm/circt/actions/runs/9499212780/job/26179638414

  2024-06-13T12:33:13.3427930Z Save cache using key "ccache-nightly-clang-Debug-OFF-OFF-2024-06-13T12:33:13.342Z".

dtzSiFive commented 2 weeks ago

Thanks for reporting, that's strange!

We should be looking up the cache with the date-less prefix which should match?

https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#cache-hits-and-misses

Quick glance at their code suggests the timestamp is only appended when saving. But it does appear the nightly ccache isn't found / being restored.

The action we're using does support setting our own restore-keys but I don't /think/ that should be necessary?
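The lookup described above can be sketched roughly as follows. This is only an illustration of prefix matching, assuming the documented GitHub behavior that a partial match restores the most recently created cache; `restore_by_prefix` and the `saved_caches` table are hypothetical, and only the second key is taken from the log in this thread.

```python
from datetime import datetime
from typing import Optional


def restore_by_prefix(prefix: str, saved: dict[str, datetime]) -> Optional[str]:
    """Return the most recently created saved key that starts with `prefix`."""
    matches = [key for key in saved if key.startswith(prefix)]
    if not matches:
        return None  # cache miss: nothing shares the prefix
    return max(matches, key=lambda key: saved[key])


# Hypothetical cache listing; creation times are made up for illustration.
saved_caches = {
    "ccache-nightly-clang-Debug-OFF-OFF-2024-06-12T12:30:00.000Z": datetime(2024, 6, 12, 12, 30),
    "ccache-nightly-clang-Debug-OFF-OFF-2024-06-13T12:33:13.342Z": datetime(2024, 6, 13, 12, 33, 13),
    "ccache-nightly-clang-Release-OFF-OFF-2024-06-13T12:40:00.000Z": datetime(2024, 6, 13, 12, 40),
}

# The save key carries a timestamp, but the lookup uses the date-less prefix.
restored = restore_by_prefix("ccache-nightly-clang-Debug-OFF-OFF-", saved_caches)
print(restored)
```

Under these assumptions a timestamped save key does not prevent hits, as long as an older cache with the same prefix has not been evicted.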

Kicked off a new run after checking we have the ccaches from nightly available, let's see if they're found. It might just be that they're evicted before they're used again.

dtzSiFive commented 2 weeks ago

Looks like we're finding the ccaches, see: https://github.com/llvm/circt/actions/runs/9502641411/job/26191321808#step:4:14

And judging by their pace, I think the ccache is working great for at least the Release builds. From one job that just finished (in 6 minutes! 2m for the build proper!): cache hit rate 83.28 %: https://github.com/llvm/circt/actions/runs/9502641411/job/26191321808#step:14:13

Debug builds do not appear to be working well (let's see the post-build ccache statistics...). Probably our 300M limit is too small for ccache to be effective there.

dtzSiFive commented 2 weeks ago

Looking into this a bit more (numbers are from specific jobs but spot-checked across a few): the Release post-build stats report numbers like 14172 files in the 300M cache. For the Debug build it's much smaller: 1994.

Looks like for Debug we have 6088 cache misses, so yes, 1994 entries means that even when the cache is restored we're likely just constantly evicting things.
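As a back-of-the-envelope check on the Debug numbers, here is a rough estimate of how large the cache would have to be. It is only a sketch under loud assumptions: that the 300M cache is full, that every file is about the same size, and that each of the 6088 misses needs one stored file. The helper names are hypothetical.

```python
CACHE_LIMIT_MB = 300  # limit quoted in this thread


def avg_file_mb(files_in_cache: int, limit_mb: float = CACHE_LIMIT_MB) -> float:
    """Average size per cached file, assuming the cache is filled to its limit."""
    return limit_mb / files_in_cache


def estimated_size_mb(files_needed: int, files_in_cache: int,
                      limit_mb: float = CACHE_LIMIT_MB) -> float:
    """Size needed to hold `files_needed` files of average size (a rough guess)."""
    return files_needed * avg_file_mb(files_in_cache, limit_mb)


debug_avg = avg_file_mb(1994)                  # about 0.15 MB per file
debug_needed = estimated_size_mb(6088, 1994)   # roughly three times the limit
print(f"Debug: ~{debug_avg:.2f} MB per file, ~{debug_needed:.0f} MB to hold 6088 files")
```

If those assumptions are anywhere near right, a Debug build would want on the order of three times the current limit before a restored cache could hold a full build.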

The cache statistics for Release configurations suggest it also needs fewer files (compared to the 6k cache misses reported for Debug):

  cache hit (direct)                  2629
  cache hit (preprocessed)             241
  cache miss                           576

Debug builds pulling in more makes sense, but I just thought that was interesting :). Anyway, 14k entries when we need 3-4k tells a better story.
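For reference, the hit rate implied by the Release statistics above can be recomputed directly. This is a small sketch using only the three counters quoted in this comment; it assumes ccache's reported hit rate is simply hits divided by hits plus misses.

```python
# Counters copied from the Release post-build statistics above.
direct_hits = 2629
preprocessed_hits = 241
misses = 576

hits = direct_hits + preprocessed_hits
total_lookups = hits + misses          # compilations that went through ccache
hit_rate = 100 * hits / total_lookups

print(f"{total_lookups} lookups, cache hit rate {hit_rate:.2f} %")
```

The total of about 3.4k lookups is where the "3-4k" figure comes from, and the computed rate appears to line up with the 83.28 % reported for the Release job earlier in the thread.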


uenoku commented 2 weeks ago

Thank you for taking a look at the details! It does make sense :) It's unfortunate that the cache doesn't survive a day.

teqdruid commented 2 weeks ago

I also think that the GH cache limit isn't large enough for our builds.

uenoku commented 2 weeks ago

When I looked at https://github.com/llvm/circt/actions/caches, at least half of the cache space was used by the short integration tests, which is reasonable since they run on every PR and use unified builds. I believe we could do something smart and keep only the most recent cache for the short integration tests. But anyway, I agree that the nightly integration tests don't have any problem with ccache, so I'll just close this issue.
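One way the "keep only the most recent one" idea might look, as a minimal sketch: given a listing of caches, pick everything under a prefix except the newest entry as a candidate for deletion. The `CacheEntry` type, the `ccache-short-` prefix, and the sample keys are all hypothetical, and the sketch does not talk to GitHub at all.

```python
from datetime import datetime
from typing import NamedTuple


class CacheEntry(NamedTuple):
    key: str
    created_at: datetime


def stale_entries(entries: list[CacheEntry], prefix: str) -> list[CacheEntry]:
    """Entries under `prefix` other than the most recently created one."""
    matching = sorted(
        (e for e in entries if e.key.startswith(prefix)),
        key=lambda e: e.created_at,
    )
    return matching[:-1]  # empty if zero or one entry matches


# Hypothetical listing; key names and timestamps are made up for illustration.
listing = [
    CacheEntry("ccache-short-clang-1", datetime(2024, 6, 11)),
    CacheEntry("ccache-short-clang-2", datetime(2024, 6, 13)),
    CacheEntry("ccache-short-clang-3", datetime(2024, 6, 12)),
    CacheEntry("ccache-nightly-clang-1", datetime(2024, 6, 13)),
]

to_delete = stale_entries(listing, "ccache-short-")
print([e.key for e in to_delete])
```

Nightly caches are left alone here because they do not share the prefix; whether deleting older short-test caches actually helps would depend on how often PR builds reuse them.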