Closed bartdesmet closed 7 months ago
@matus-chochlik This is ready for review now. Let me know your thoughts.
Don't merge the PR yet. I'm still verifying some behaviors here. It looks good now. I missed a self.
in a place, which got silently ignored due to the try
block higher up, and pushed the fix.
Could you please squash the fixes into the relevant commits before merging?
I've rebased the branch now.
Thanks.
Supersedes https://github.com/matus-chochlik/ctcache/pull/54 and implements https://github.com/matus-chochlik/ctcache/issues/53 in a better way.
This PR is still in draft because I'm planning to do more extensive testing over here, but it shows the proposed approach as outlined in the linked issue.I've tested the behavior in our CI system and on local developer machines, and it works as advertised in the linked issue.Some refactorings are done to make the caches a bit more compositional:
ClangTidyMultiCache
implements the cache interface over an array of caches.ClangTidyCacheWithStats
is a decorator around a cache that adds hit/miss counters.ClangTidyCacheStats
is the generalization of the original "local stats".ClangTidyLocalCache
has its statistics logic removed, in favor of the decorator and the generalized stats.With this in place, we can decorate the
ClangTidyLocalCache
but alsoClangTidyMultiCache
to get local/remote hit/miss statistics.We can then construct
ClangTidyCache
to have a_local
and_remote
cache, both of which are optional, and are optionally decorated with the stats logic.Finally, the logic in
ClangTidyCache
now makes the "local before remote" lookup logic explicit, and now has a natural place to implement the write-back behavior where the local cache is updated upon a remote cache hit. I've added a flag to opt-out from this behavior if undesirable.It'd likely make sense to haveI've added a command line flag to return "raw statistics", e.g. in JSON format, in addition to the current human-readable stats output. This will make it easier for other tools to parse, e.g. when doing a hit/miss snapshot before/after a build to output the stats.Update 4/24/2024
I've tested and refined the implementation in this PR. With the
--print-stats
option to dump the statistics in JSON format, and the use ofjq
in some script, I was able to get to feature parity withccache
in terms of reporting statistics of local/remote cache hits/misses inctcache
. An example of the report printed at the end of our build script here is shown below:This is on a developer machine that happens to run the same build flavor of a repository as was already built by a CI machine remotely. The read-only access to secondary/remote caches is then pulling in the binaries through
ccache
and theclang-tidy
result usingctcache
. The write-back feature in this PR makes it such that a subsequent identical build finds local hits, just as is the case ofccache
:For the
--print-stats
reporting and differential analysis between build start and build end, I'm using the following logic usingjq
to get the stats and do some subtractive math inbash
:using
//0
to safeguard against runs where remote or local cache is disabled, and the resulting report omits these keys in the JSON object.FWIW, I found the use of JSON more modern and less error-prone to deal with, e.g. if consuming scripts are using Python as well. With
jq
, it's easy to query in the shell as well, though one could argue a more "raw" format would be preferable. For comparison,ccache
reports stats in a tab-separated format, but this lacks structure compared to e.g. nested JSON objects namedlocal
andremote
. Here's the code I have forccache
's differential analysis: