bytebeamio / rumqtt

The MQTT ecosystem in rust
Apache License 2.0
1.64k stars, 252 forks

ci(actions): prevent "out of space" error on github actions ubuntu image #804

Closed arunanshub closed 8 months ago

arunanshub commented 9 months ago

Type of change

Checklist:

This is a CI change.

swanandx commented 9 months ago

We do cache the build artifacts using Swatinem/rust-cache@v2, and it seems to be working fine.

coveralls commented 9 months ago

Pull Request Test Coverage Report for Build 7986752372

Details


Totals Coverage Status
Change from base Build 7986144141: 37.8%
Covered Lines: 6255
Relevant Lines: 16534

💛 - Coveralls
arunanshub commented 9 months ago

sccache is different from rust-cache.

sccache is a ccache-like compiler caching tool. It is used as a compiler wrapper and avoids compilation when possible, storing cached results either on local disk or in one of several cloud storage backends.
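
For context, here is a minimal sketch of how sccache is usually wired into a Cargo build as a compiler wrapper. It assumes sccache is already installed on the runner, and the cache directory is an illustrative choice rather than anything this PR configures.

```shell
# Assumption: the sccache binary is already on PATH.
# Cargo runs $RUSTC_WRAPPER in place of rustc and passes the real rustc as the first argument.
export RUSTC_WRAPPER=sccache

# sccache cannot cache incrementally compiled crates, so incremental builds are turned off.
export CARGO_INCREMENTAL=0

# Hypothetical local cache location; sccache uses a default directory when this is unset.
export SCCACHE_DIR="${HOME}/.cache/sccache"

# After a build, `sccache --show-stats` reports cache hits and misses.
```

Unlike rust-cache, which saves and restores directories between workflow runs, this caches individual compiler invocations.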

swanandx commented 9 months ago

sccache is different from rust-cache.

how will it benefit us differently from rust-cache here?

arunanshub commented 9 months ago

Check here: https://www.reddit.com/r/rust/comments/rvqxkf/does_sccache_really_help/

I am still investigating this. If you have doubts about the concepts, I'd prefer a private DM.

swanandx commented 9 months ago

from thread you quoted: this or by matklad

I don't see any benefit that sccache would give us either. We already have rust-cache set up.

I'd prefer a private DM.

sure, feel free to DM if required.

arunanshub commented 9 months ago

Observations and ideas

  1. Unfortunately, almost nobody does this. A typical example would just cache the whole of ./target directory. That’s wrong — the ./target is huge, and most of it is useless on CI.

    Caching the entire target directory provides no benefit.

  2. How does sccache compare to rust-cache?

    I personally don’t use sccache, as I didn’t find it to actually meaningfully improve compile times for me over what Cargo & CI caching natively do for my use-cases. Though, that was more than a couple of years ago. (matklad on Reddit; see also Faster rust builds)

  3. Since this only happens on Ubuntu, is it tied to a linker issue?

[!NOTE] Using the mold linker cut the build time by half on ubuntu.

Even folks at astral-sh/ruff moved to mold linker (and nextest). https://github.com/astral-sh/ruff/pull/9921

  4. Excluding debuginfo should save space.

  5. Can we split up the build steps into their own groups? Eg, check/lint and tests in their own group?

  6. Benefits of cargo hack --each-feature over --all-features. In other words, do we really need to build for each feature separately?

  7. cargo hack doc --verbose --no-deps --each-feature --no-dev-deps --optional-deps url -p rumqttc -p rumqttd

    If our only goal is to check the docs, it's already done by doctests. We don't need to produce documentation for the entire workspace along with its dependencies.
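
The debuginfo and mold ideas above can be tried through environment variables alone, without editing any Cargo.toml. This is only a sketch, not the configuration used in this PR: it assumes clang and mold are installed on the Ubuntu runner and that the target triple is x86_64-unknown-linux-gnu.

```shell
# Drop debuginfo from the dev profile to shrink ./target.
# The test profile inherits from dev, so test builds are covered as well.
export CARGO_PROFILE_DEV_DEBUG=0

# Assumption: clang and mold are installed; link through clang and ask it to use mold.
export CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=clang
export CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS="-C link-arg=-fuse-ld=mold"

# Caveat: if RUSTFLAGS is already set in the workflow, Cargo uses it
# instead of the target-specific variable above.
```

Because the variables are scoped to the Linux target triple, the Mac and Windows jobs would be unaffected.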

swanandx commented 9 months ago

Though, that was more than a couple of years ago.

quoted em as you told me to check there

Faster rust builds

this just confirms that rust-cache is good enough, and we don't need to change it.

Since this only happens on Ubuntu, is it tied to a linker issue?

no idea

Excluding debuginfo should save space.

yes

Can we split up the build steps into their own groups? Eg, check/lint and tests in their own group?

why? no significant benefits there. We will know if CI failed due to compilation error of tests even without separation right?

arunanshub commented 9 months ago

We will know if CI failed due to compilation error of tests even without separation right?

My suspicion is that the steps before the actual test contribute to filling up the disk space. However if we build them as a separate job, it would use a separate image. After that job is finished, that image will be deallocated, along with the build artifacts. This is just a hypothesis, I have yet to test it out.

this just confirms that rust-cache is good enough, and we don't need to change it.

The question is not whether it is good or bad. The question is: why does it still throw an "out of space" error even after caching the dependencies? Is caching not enough? If it is not enough, then why is it working on Mac and Windows?

Million dollar question: what exactly triggers the out of space error? And why are Mac and Windows immune to it?
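
One way to narrow this down would be to log disk usage between CI steps and see which step eats the space. A rough sketch follows; `report_disk` is a hypothetical helper name, and it assumes it is run from the workspace root where ./target lives.

```shell
# Hypothetical helper: print free space for the current filesystem
# and the size of ./target (if it exists), tagged with a step label.
report_disk() {
  label="$1"
  echo "== disk usage: ${label} =="
  df -h .
  if [ -d target ]; then
    du -sh target
  fi
}

# Example: call once before and once after each suspicious step.
report_disk "before build"
```

Comparing the numbers from the Ubuntu job with those from the Mac and Windows jobs would also show whether the artifact sizes really differ.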

swanandx commented 9 months ago

My suspicion is that the steps before the actual test contribute to filling up the disk space

Don't think so; the actual issue here is likely the code coverage step. It may be the culprit filling up the disk space, as in https://github.com/taiki-e/cargo-llvm-cov/issues/284. llvm-cov significantly reduced cache sizes as well.

then why is it working on Mac and Windows?

might be due to difference in size of produced artifacts / caches?

also sccache spammed the cache store https://github.com/bytebeamio/rumqtt/actions/caches . So it is hard to clear up all the caches and start from scratch.

arunanshub commented 9 months ago

also sccache spammed the cache store https://github.com/bytebeamio/rumqtt/actions/caches . So it is hard to clear up all the caches and start from scratch.

It isn't a big issue since GitHub will evict the caches anyway.

might be due to difference in size of produced artifacts / caches?

I find it hard to believe that there is a significant difference in sizes of artifacts/caches produced by Ubuntu and mac/windows.


What's interesting is, the person who wrote cargo-hack deletes several unused dirs before testing on Ubuntu.

https://github.com/taiki-e/cargo-llvm-cov/issues/335#issuecomment-1890331417
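
A sketch of what such a cleanup step could look like. `free_runner_space` is a hypothetical helper, and the directories in the usage comment are assumptions (large preinstalled toolchains often reported on GitHub-hosted Ubuntu images), not the exact list from the linked comment. It defaults to a dry run so nothing is deleted by accident.

```shell
# Hypothetical helper: remove the given directories to free disk space.
# DRY_RUN=1 (the default) only prints what would be removed.
# SUDO can be set to "sudo" when the directories are owned by root.
free_runner_space() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would remove: $dir"
      else
        echo "removing: $dir"
        ${SUDO:-} rm -rf -- "$dir"
      fi
    fi
  done
}

# Possible usage on an Ubuntu runner (paths are assumptions; check them with `du` first):
#   DRY_RUN=0 SUDO=sudo free_runner_space /usr/share/dotnet /usr/local/lib/android /opt/ghc
```

Running `df -h` before and after the call shows how much space the step actually recovers.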