thrumdev / blobs

Blobchains on Polkadot and Kusama
https://docs-site-pi.vercel.app
Apache License 2.0
64 stars 8 forks

ci: print log on test failure and clean cache #262

Closed gabriele-0201 closed 9 months ago

gabriele-0201 commented 9 months ago

Cleaning the cargo cache between runs is necessary to avoid using binaries from different PRs. The main thing cargo-cache does is remove git repos that have been checked out.

`cargo install cargo-cache --no-default-features --features ci-autoclean` should only be run in the GHA Dockerfile. Currently, it is also executed in the GitHub workflow because the runner needs to be updated manually to pick up these changes. The command will be removed from the workflow in a follow-up PR once the runner is updated.
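A sketch of what this split could look like (the Dockerfile and workflow file names here are assumptions, not the actual paths in this repo):

```shell
# In the GHA runner Dockerfile (assumed location): bake the tool into the image.
#   RUN cargo install cargo-cache --no-default-features --features ci-autoclean

# Temporary duplicate in the workflow step, until the runner image is rebuilt;
# to be dropped in a follow-up PR:
cargo install cargo-cache --no-default-features --features ci-autoclean
```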

gabriele-0201 commented 9 months ago

Current dependencies on/for this PR:

This stack of pull requests is managed by Graphite.

gabriele-0201 commented 9 months ago

we have plenty of cache which is available immediately

Yes, but sharing the cache seems to cause another problem: crates like ikura-shim are not recompiled between runs. Yesterday, while running CI on #251, I got the shim version from #260. It seems that a shared $CARGO_HOME may prevent cargo from recognizing and using the new binary versions, sticking to the old ones instead. I need a way to instruct cargo to recompile only the binaries in our repository while still using the cached crates.
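One standard way to force a rebuild of only the workspace's own crate while keeping the cached dependencies would be a per-package clean (a sketch of a possible approach, not what this PR does; the package name is taken from the thread):

```shell
# Remove compiled artifacts for just the workspace package ikura-shim,
# leaving dependency artifacts and the $CARGO_HOME caches untouched:
cargo clean -p ikura-shim

# The next build then recompiles only ikura-shim (and anything depending on it),
# reusing all cached dependency artifacts:
cargo build -p ikura-shim
```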

Following this guide, I first tested cargo-cache, which offers a seamless way to clear the cache on CI via the `cargo cache --autoclean` flag. This command simplifies cleanup by removing checkouts while preserving the original files needed to reconstruct them on disk.
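As a CI-step sketch, the invocation described above would look roughly like this (my reading of the thread; exact flags in the repo's workflow may differ):

```shell
# Install a trimmed build of cargo-cache meant for CI use:
cargo install cargo-cache --no-default-features --features ci-autoclean

# Remove removable cache parts (e.g. registry and git checkouts) while
# keeping the files cargo needs to restore them, so dependencies are not
# re-downloaded from scratch:
cargo cache --autoclean
```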

Initially uncertain about its impact:

  1. I wondered if it would trigger a full recompilation given our dependencies from the polkadot-sdk branch
  2. I questioned whether it would eliminate all locally stored binaries (and cause the recompilation of those)

After conducting the test, I found that CI times remained consistent at 5 minutes regardless of whether cargo-cache was used, and the correct shim version was compiled. Now that I understand better what cargo-cache does, I'm more confused than before; I really don't know why it worked.

I aim to understand where the local workspace stores incremental build data and how it influences binary recompilation (deleting items from git checkouts may not provide any benefit)

pepyakin commented 9 months ago

OK, this is a bit clearer, but I am still not convinced that it's a good solution. Specifically, I don't understand the exact reason for the failure, or how using cargo-cache mends it.

TBH, I have a suspicion that you are on the wrong path. My reasoning is based on the following:

  1. cargo home mainly stores binaries installed by cargo install and rustup, plus sources from crates.io and git.
  2. it's not clear how you came to the conclusion that you got the wrong binary exactly. However, I assume you ran it using cargo run, not cargo install. Therefore, $CARGO_HOME/bin shouldn't be involved.
  3. ikura-shim is also not a git/crates.io dependency, so cleaning the source parts of the cargo cache should not help either.
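The distinction in point 2 can be sketched as follows (the `-p ikura-shim` invocation and the default paths are assumptions for illustration):

```shell
# `cargo install` copies a release binary into $CARGO_HOME/bin, which would
# survive across CI runs if $CARGO_HOME is cached:
#   cargo install --path <crate-dir>    # -> $CARGO_HOME/bin/ikura-shim

# `cargo run` builds into the target directory and executes from there;
# $CARGO_HOME/bin is never consulted, so a stale installed binary there
# cannot explain running the wrong shim:
cargo run -p ikura-shim                 # -> runs target/debug/ikura-shim
```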

So this doesn't look like the root cause. Maybe removing those caches somehow invalidates the build and triggers a rebuild, or something along those lines?

I aim to understand where the local workspace stores incremental build data and how it influences binary recompilation (deleting items from git checkouts may not provide any benefit)

It shouldn't really matter whether it's the local workspace or anything else, or incremental or not: the compiled artifacts are stored under the target directory (CARGO_TARGET_DIR), which is configured to be /cargo_target on the runner.
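This could be checked directly on the runner (a sketch; it assumes CARGO_TARGET_DIR is exported as described above and that a debug build of the shim exists):

```shell
# Compiled artifacts, including workspace binaries, live under the target
# directory, not under $CARGO_HOME:
echo "$CARGO_TARGET_DIR"                   # expected: /cargo_target
ls "$CARGO_TARGET_DIR"/debug/ikura-shim    # the binary cargo run executes

# $CARGO_HOME only holds installed binaries and downloaded sources:
ls "$CARGO_HOME"/bin "$CARGO_HOME"/registry "$CARGO_HOME"/git
```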

gabriele-0201 commented 9 months ago

however I am still not convinced that it's a good solution

I agree; initially I thought cargo-cache was doing something different.

removing those caches somehow invalidates the build and triggers rebuilding, or something along those lines?

Yes, that's probably what happens. Yesterday it worked only by chance; on the second push, it stopped working.

I will close this PR, fix the remaining issues for #251, and then come back to this problem.