populationgenomics / images

Reusable single-software docker images
MIT License

Spicy image building #139

Open MattWellie opened 5 months ago

MattWellie commented 5 months ago

For context, I don't have a great idea of how to solve this, or whether it truly is an issue.

Vaguely related: https://github.com/populationgenomics/analysis-runner/pull/682

Our image building could be much more efficient: we don't preserve layers between builds, even when a layer's contents are unchanged (e.g. we have roughly 23(?) images that each install micromamba separately).
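To firm up that count, a short script could list the Dockerfiles that mention micromamba. This is only a sketch: the `images/` root directory and the `Dockerfile*` naming pattern are assumptions about the repository layout, and a plain substring match will also pick up comments.

```python
from pathlib import Path


def dockerfiles_mentioning(root: Path, needle: str = "micromamba") -> list[Path]:
    """Return every Dockerfile under `root` whose text contains `needle`."""
    hits = []
    # Assumption: Dockerfiles are named "Dockerfile" or "Dockerfile.<suffix>".
    for path in sorted(root.rglob("Dockerfile*")):
        if path.is_file() and needle in path.read_text(errors="ignore"):
            hits.append(path)
    return hits
```

Calling `dockerfiles_mentioning(Path("images"))` from the repository root would then give the list of images that each carry their own micromamba install step.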

Something fun would be to reconsider our images by separating static content (OS, compilers, system packages, etc.), semi-static content (perhaps versioned Python installations, or the installation of our own Hail fork/version), and variable content (our internal software libraries, third-party tools we expect to update regularly, versions linked to individual repository hashes), and then coming up with a more interactive build plan.

For inspiration: https://github.com/broadinstitute/gatk-sv/blob/main/scripts/docker/build_docker.py
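As a rough illustration of the static / semi-static / variable split, here is a minimal sketch of how a build plan might decide which layers need rebuilding. Everything in it is hypothetical: the `Layer` class, the layer names, and the pinned inputs are made up for illustration, and this is not how the gatk-sv script or our current build works. The idea is simply that each layer's digest covers its own inputs plus its parent's digest, so a change only invalidates that layer and whatever sits on top of it.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Layer:
    name: str
    tier: str  # "static", "semi-static", or "variable"
    inputs: tuple[str, ...]  # whatever determines the contents, e.g. pinned versions
    parent: str | None = None


def layer_digest(layer: Layer, layers: dict[str, Layer]) -> str:
    """Hash a layer's inputs together with its parent's digest."""
    h = hashlib.sha256()
    if layer.parent is not None:
        h.update(layer_digest(layers[layer.parent], layers).encode())
    for item in layer.inputs:
        h.update(item.encode())
        h.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
    return h.hexdigest()


def plan_rebuilds(layers: dict[str, Layer], previous: dict[str, str]) -> list[str]:
    """Names of layers whose digest changed since the last build, parents first."""

    def depth(name: str) -> int:
        d = 0
        while layers[name].parent is not None:
            name = layers[name].parent
            d += 1
        return d

    stale = [n for n, l in layers.items() if previous.get(n) != layer_digest(l, layers)]
    return sorted(stale, key=lambda n: (depth(n), n))


# Hypothetical three-tier stack; names and versions are illustrative only.
layers = {
    "base": Layer("base", "static", ("debian:bookworm-slim", "build-essential")),
    "python-hail": Layer("python-hail", "semi-static", ("python=3.10", "hail=<pinned>"), "base"),
    "internal-tool": Layer("internal-tool", "variable", ("commit=abc123",), "python-hail"),
}
previous = {name: layer_digest(layer, layers) for name, layer in layers.items()}

# Bumping only the variable layer leaves the lower tiers untouched.
layers["internal-tool"] = replace(layers["internal-tool"], inputs=("commit=def456",))
print(plan_rebuilds(layers, previous))  # ['internal-tool']
```

In a real setup the stored digests would presumably live alongside the pushed image tags, so a build could skip any tier whose digest already exists in the registry.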

What we hope to gain from this: