python / cpython

The Python programming language
https://www.python.org/

GHA: ccache not efficient? #116975

Open pitrou opened 3 months ago

pitrou commented 3 months ago

Bug description:

  1. If you look at the ccache stats in ccache-enabled GitHub Actions jobs, you'll notice that the cache is rarely useful: there are few hits and many misses.

Example here: https://github.com/python/cpython/actions/runs/8332366775/job/22801352882

ccache stats
  /usr/bin/ccache -s
  Summary:
    Hits:              52 /  341 (15.25 %)
      Direct:          30 /  343 (8.75 %)
      Preprocessed:    22 /  313 (7.03 %)
    Misses:           289
      Direct:         313
      Preprocessed:   291
    Uncacheable:      104
  Primary storage:
    Hits:             387 /  685 (56.50 %)
    Misses:           298
    Cache size (GB): 0.18 / 0.20 (90.22 %)
    Cleanups:           8

Most likely this is because all those builds clash for the same cache keys even though they use different compiler options. The solution would be to use per-build cache keys.
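One way to picture per-build cache keys is to derive the key from everything that distinguishes one build from another. The sketch below is only an illustration: `ccache_key`, the job names, and the flag lists are hypothetical and not taken from the CPython workflows.

```python
import hashlib


def ccache_key(job_name: str, compiler: str, flags: list[str]) -> str:
    """Return a cache key that changes whenever the build configuration changes.

    Hypothetical helper: the inputs are assumptions about what distinguishes
    one CI build from another, not the actual workflow's scheme.
    """
    # Join with a NUL separator so ["-O2", "-g"] and ["-O2-g"] cannot collide.
    payload = "\x00".join([job_name, compiler, *flags])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
    return f"ccache-{job_name}-{digest}"


# Two builds on the same runner image with different compiler options
# get different keys, so they no longer evict each other's entries.
debug_key = ccache_key("ubuntu", "gcc", ["-O0", "-g"])
tsan_key = ccache_key("ubuntu", "gcc", ["-O2", "-fsanitize=thread"])
```

With keys like these, each build configuration restores and saves its own cache instead of competing for a shared one.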

  2. The "Post Configure ccache action" step itself takes ~2 minutes (which seems unexpected), about as long as it takes to compile CPython. So it's not obvious there is any benefit at all, even if the problem above were fixed.

pitrou commented 3 months ago

@hugovk

hugovk commented 3 months ago

cc @encukou and @erlend-aasland who've also been looking into ccache (https://github.com/python/cpython/issues/113858).

Time in seconds for each step:

| Step | with ccache | without ccache |
|---|---|---|
| Set up job | 1 | 1 |
| Run actions/checkout@v4 | 5 | 5 |
| Runner image version | 0 | 0 |
| Restore config.cache | 0 | 0 |
| Install Dependencies | 18 | 80 |
| TSAN Option Setup | 0 | 0 |
| Add ccache to PATH | 0 | 0 |
| Configure ccache action | 2 | n/a |
| Configure CPython | 4 | 79 |
| Build CPython | 97 | 93 |
| Display build info | 5 | 7 |
| Tests | 130 | 125 |
| Post Configure ccache action | 136 | n/a |
| Post Restore config.cache | 0 | 0 |
| Post Run actions/checkout@v4 | 0 | 0 |
| Complete job | 0 | 0 |
| Total | 398 | 390 |

The "Install Dependencies" difference (18s vs 80s) looks unrelated to ccache, so the total without ccache could be seen here as ~330s instead of 390s?

Looking at just the ccache/configure steps, the run with ccache costs an extra ~63s over the run without ccache.
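The ~63s figure can be reproduced from the step timings above. This is a reconstruction: it assumes the "ccache/configure steps" are the three rows below, which is the subset that yields that difference. "n/a" entries are treated as zero seconds.

```python
# Seconds per step as (with ccache, without ccache); None means the step did not run.
# Assumption: these three rows are the "ccache/configure steps" being compared.
steps = {
    "Configure ccache action": (2, None),
    "Configure CPython": (4, 79),
    "Post Configure ccache action": (136, None),
}

with_total = sum(w or 0 for w, _ in steps.values())
without_total = sum(wo or 0 for _, wo in steps.values())
extra = with_total - without_total

print(with_total, without_total, extra)  # 142 79 63
```

The 75s saved in "Configure CPython" is more than offset by the 136s spent in the post step.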

pitrou commented 3 months ago

By the way, the slowness of the "Post Configure ccache action" doesn't match our experience on Apache Arrow, where saving a 1GB ccache can take 20 seconds. Example here with the "Post Cache Docker Volumes" step: https://github.com/apache/arrow/actions/runs/8329904897/job/22793369592

Instead of using hendrikmuhs/ccache-action, you might look into using the cache action directly. Setting up ccache is not difficult.