zanieb opened this issue 3 months ago
Would it be absurd to introduce CI time regression checks in CI, like the CodSpeed benches? Unfortunately GitHub Runner performance is super noisy, so it might not work.
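A duration gate of the kind being floated could be as simple as comparing against the median of recent runs, with a wide tolerance to absorb runner noise. A minimal sketch (the function, numbers, and tolerance are illustrative, not an existing tool):

```python
from statistics import median

def is_ci_time_regression(recent_durations_s, measured_s, tolerance=0.25):
    """Flag a run as a regression only if it exceeds the median of
    recent runs by more than `tolerance` (noisy runners need slack)."""
    baseline = median(recent_durations_s)
    return measured_s > baseline * (1 + tolerance)

# Median baseline is 480s, so the threshold is 600s at 25% tolerance.
print(is_ci_time_regression([470, 480, 500], 520))  # within noise → False
print(is_ci_time_regression([470, 480, 500], 700))  # clearly slower → True
```

The wide tolerance is the whole trick: too tight and every noisy run fails the gate, too loose and gradual creep slips through.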
Self-hosted runners could arguably help here too, but the maintenance/security/cost burden is likely too large.
I strongly considered self-hosted runners, but it seemed painful to orchestrate Windows runners in particular.
I also very much want to look into something like https://github.com/astral-sh/uv/pull/609 again to cache our network traffic — I think that'd help a lot.
At least for python, maybe worth using something like https://github.com/hauntsaninja/nginx_pypi_cache and saving a copy of the cache as a github cache and reloading it. I actually use it locally a lot to speed up tests. Not sure how it would perform in CI.
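The core of what such a cache buys can be sketched in a few lines: the first request for a URL goes upstream, and every later request is served from disk. This is only an illustration of the idea (the `fetch` callable and on-disk layout are made up, not how nginx_pypi_cache is implemented):

```python
import hashlib
import os
import tempfile

def cached_fetch(url, fetch, cache_dir):
    """Return the body for `url`, calling `fetch` upstream only on a miss."""
    path = os.path.join(cache_dir, hashlib.sha256(url.encode()).hexdigest())
    if os.path.exists(path):  # hit: no network traffic at all
        with open(path, "rb") as f:
            return f.read()
    body = fetch(url)  # miss: one upstream round trip, then persist
    with open(path, "wb") as f:
        f.write(body)
    return body

# Demo with a stub fetcher: the second call never reaches "upstream".
upstream_calls = []
def stub_fetch(url):
    upstream_calls.append(url)
    return b"simple-index-page"

with tempfile.TemporaryDirectory() as d:
    cached_fetch("https://pypi.org/simple/requests/", stub_fetch, d)
    cached_fetch("https://pypi.org/simple/requests/", stub_fetch, d)
print(len(upstream_calls))  # → 1
```

In CI the cache directory would be the part saved and restored via the GitHub cache, so warm runs skip the upstream index entirely.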
That strikes me as a good idea...
Ah that might be easier than using mitmproxy or rolling our own proxy in Rust. Thanks for the link!
edit: I created a published image at https://github.com/astral-sh/nginx_pypi_cache/pkgs/container/nginx_pypi_cache we can use in our jobs if someone wants to trial it in Ubuntu (or locally even, to start)
Some progress with:
Linux is acceptably fast now. We're at the limit for macOS machine size without going to alternative runner providers. There are larger Windows runners; maybe I should test one (#5890). Cost may be a problem at some point — alternative runner providers may be cheaper (they often advertise something like a 2x cost reduction).
have we considered using https://bazel.build/? (correct me if i'm wrong, but `cargo test` doesn't seem to skip tests when none of their dependencies changed)
admittedly adding bazel would significantly increase the complexity though 🙀
Yes, I've heard a lot of complaints about bazel's complexity.
this is definitely true - i've seen it improve CI time drastically at the same time though, so maybe something to consider down the line if there's no other option...
Running tests in CI now takes >8 minutes (it was previously <1 minute). We've done a lot to optimize this previously, e.g.:
- #3508
- #1832
- #2933
But, CI time is always growing as we add more features and coverage. This is a tracking issue to improve the situation and discuss sources of slowness.
See also:
- #878