E3SM-Project / e3sm_diags

E3SM Diagnostics package
https://e3sm-project.github.io/e3sm_diags
BSD 3-Clause "New" or "Revised" License

[DevOps]: Speed up integration tests on GitHub Actions by caching test data downloads #740

Open tomvothecoder opened 11 months ago

tomvothecoder commented 11 months ago

Overview

The CI/CD workflow runs for pull requests and pushes to main, and each run takes around 11-13 minutes.

The testing step takes up ~75% of the build time. Most of that time is spent downloading the test data from LCRC.

The Problem

This slow build time has a hidden cost that impacts overall productivity:

...developers could either be waiting the entire time a build runs or end up context-switching to work on something else while a build runs. Both of these impact overall productivity (more on this below). -- https://github.blog/2022-12-08-experiment-the-hidden-costs-of-waiting-on-slow-build-times/

Possible Solutions

Look into caching the downloaded test data. We also need a scheme for refreshing the cache when the test data changes.

https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows
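One possible refresh scheme, sketched under assumptions: GitHub's cache action restores a cache by its key, so deriving the key from the list of test files (plus a manually bumped version string) would force a fresh download whenever either changes. The function name `make_cache_key`, the `test-data` prefix, and the `version` parameter are hypothetical and not part of e3sm_diags.

```python
import hashlib


def make_cache_key(file_names, prefix="test-data", version="v1"):
    """Build a cache key from the names of the test data files.

    Hypothetical helper: the key changes when the file list changes or
    when `version` is bumped by hand, which results in a cache miss and
    therefore a fresh download.
    """
    digest = hashlib.sha256()
    # Sort so that the key does not depend on the order of the list.
    for name in sorted(file_names):
        digest.update(name.encode("utf-8"))
        digest.update(b"\n")
    return f"{prefix}-{version}-{digest.hexdigest()[:16]}"
```

Note that this only detects changes to the file list. If a file on the server is replaced under the same name, the key stays the same, so bumping `version` would be the manual way to refresh in that case.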

mahf708 commented 10 months ago

+1 on the productivity, and thanks for including the links 😃

tomvothecoder commented 10 months ago

> +1 on the productivity, and thanks for including the links 😃

Of course @mahf708! I'm always on board for efficiency and improvements.

tomvothecoder commented 9 months ago

In PR #747, I cleaned up integration tests and removed redundant ones. This cut the total build time to ~6 minutes. Here's a build run showing these improvements: https://github.com/E3SM-Project/e3sm_diags/actions/runs/7199737714

The remaining integration test performs an image diff check. It executes a diagnostic run using default sets and a list of parameters. This test diagnostic run is pretty heavy, which is why it takes ~3 minutes.

It actually only takes ~30 secs to download the integration test data and images. It is still a good idea to cache these resources because they don't change often.
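For illustration, a minimal sketch of how a download step could take advantage of a restored cache: fetch only the files that are not already on disk. The name `fetch_missing` and the `download` callback are hypothetical, and this is not how the e3sm_diags test suite currently downloads its data.

```python
from pathlib import Path


def fetch_missing(names, data_dir, download):
    """Download only the files that are not already in `data_dir`.

    `download(name, target)` is a caller-supplied function (an assumption
    of this sketch) that writes the file called `name` to the path `target`.
    Returns the list of names that were actually downloaded.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in names:
        target = data_dir / name
        if target.exists():
            # Already present, e.g. restored from the CI cache.
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        download(name, target)
        fetched.append(name)
    return fetched
```

With a warm cache the function returns an empty list and no network requests are made.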