AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Implement Docker image cache cross-environment support #3299

Closed by arkid15r 1 year ago

arkid15r commented 1 year ago

Issue Number

Closes #3288

Purpose/Implementation Notes

This PR consolidates the recent efforts to optimize the refine.bio Docker image build process.

Every image name is now based on the branch hash value, and cache layers are uploaded to a registry so they can be shared across environments. This made it possible to retire the Docker.affymetrix_local image and the related prepare_image.sh logic.
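To illustrate the naming and cache-sharing idea described above, here is a minimal sketch, not the PR's actual implementation. It assumes the "branch hash" is a truncated SHA-256 of the branch name, and that layers are shared through Docker Buildx's registry cache backend. The helper names (`branch_tag`, `buildx_command`), the registry host, the `cache-` tag prefix, the image name, and the Dockerfile path are all hypothetical.

```python
import hashlib
from typing import List


def branch_tag(branch: str, length: int = 12) -> str:
    """Derive a deterministic, Docker-safe tag from a branch name.

    Assumption: the tag is a truncated SHA-256 of the branch name. The PR
    only says image names are "based on the branch hash value".
    """
    return hashlib.sha256(branch.encode("utf-8")).hexdigest()[:length]


def buildx_command(repo: str, image: str, branch: str, dockerfile: str) -> List[str]:
    """Assemble a `docker buildx build` invocation that reads and writes
    a registry-backed layer cache keyed by the branch tag."""
    tag = branch_tag(branch)
    image_ref = f"{repo}/{image}:{tag}"
    # Hypothetical convention: keep cache layers under a separate "cache-" tag.
    cache_ref = f"{repo}/{image}:cache-{tag}"
    return [
        "docker", "buildx", "build",
        "--file", dockerfile,
        "--tag", image_ref,
        # Reuse layers that another environment already pushed.
        "--cache-from", f"type=registry,ref={cache_ref}",
        # Upload all intermediate layers so later builds can reuse them.
        "--cache-to", f"type=registry,ref={cache_ref},mode=max",
        "--push",
        ".",
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the refinebio repository.
    cmd = buildx_command(
        "registry.example.com/refinebio", "api_base", "dev", "Dockerfile.example"
    )
    print(" ".join(cmd))
```

Because the tag depends only on the branch name in this sketch, any environment building the same branch resolves to the same cache reference and can reuse layers pushed elsewhere.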

The API image has been split into a base image and local/production images. The affymetrix and agilent tests are now two separate jobs and use larger runners (-m suffix).

These changes reduced test execution time by more than 50% (from ~1 hour to ~30 minutes): compare run a or run b with this PR's run.

Some points to keep in mind during review:

pre-commit-terraform has been disabled temporarily due to unrelated failures.

The PR contains all changes identified so far as necessary for further optimization of the staging/production deploy process.

Checklist