Closed: MattWellie closed this 7 months ago
Thanks for the suggestion @MattWellie!
I've been thinking about this a bit over the week. I'm totally here for this PR, but our current build mechanism causes a full image rebuild every time this is run, and changed Docker layer hashes almost certainly mean extra layers and a larger image size. I wonder if it's worth breaking these up into different tagged images, so we're able to pull in a specific image and reduce the time to build (which I'd really love).
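For illustration, a minimal sketch of the tagged-image idea (registry and image names here are hypothetical, not ours):

```dockerfile
# Hypothetical pre-built base image, published and tagged separately.
# Pinning the tag means a rebuild of this Dockerfile reuses the pulled
# base layers instead of rebuilding everything from scratch.
FROM example-registry/analysis-base:1.0.0

# Only this fast-moving layer changes between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```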
On a similar note, it would be great to move this image to the images repo and, as part of that, have some way to chain images together, so that a rebuild of a base image could trigger a rebuild of the chained images. How that works with floating tags I'm not 100% sure yet.
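Roughly what I mean by chaining, as a sketch (names hypothetical):

```dockerfile
# Child image in the images repo, chained off a shared base.
# `latest` is a floating tag; the open question is how a new push to
# base:latest would trigger a rebuild of this image.
FROM example-registry/base:latest

# Everything below is invalidated whenever the base image changes.
RUN pip install --no-cache-dir some-extra-tool
```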
FYI @illusional https://github.com/populationgenomics/images/issues/139 (I haven't assigned you, but you should be aware that this is a related issue)
Closing this; there's a related issue (linked above) about beefing up our image building more generally.
This started as a deletion of the `sample-metadata` dependency, which is redundant with the inclusion of the newer `metamist` dependency. Happy to fall back to that as a minimal change (unless there's a reason we are keeping both deps?)

The proposed changes split the crazy 3.6GB mono-layer into 3 components:

- `never-changes`
- `rarely-changes`, and
- `may-frequently-change`

My belief is that stacking the layers in that order means that instead of pulling 3.6GB to get the latest version of this image, you'd only need to pull the variable ~MB layer (provided that you have a prior version of this image on the appropriate server/local machine). AFAIK this is standard Docker theory, and our current design is non-optimal. See the sketch below.
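Roughly, the layering would look like this (a sketch only; the base image and package names are illustrative, not the exact Dockerfile):

```dockerfile
# Assumed base image, not necessarily the real one.
FROM python:3.10-slim

# Layer 1: never-changes -- system packages (illustrative selection)
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential openjdk-11-jre-headless \
    && rm -rf /var/lib/apt/lists/*

# Layer 2: rarely-changes -- the heavy Hail install
RUN pip install --no-cache-dir hail

# Layer 3: may-frequently-change -- our own fast-moving deps,
# e.g. metamist; only this small layer is re-pulled on most updates.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```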
Update: layer sizes
This results in 3 layers:
So unsurprisingly most of the weight is in the Hail installation, but it's a little more spread out. I'm experimenting with moving PhantomJS into the relatively static layer 1 as well.
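i.e. something like this (install steps illustrative):

```dockerfile
# Layer 1 (static): system packages plus PhantomJS, which essentially
# never changes (the exact install method here is illustrative).
RUN apt-get update \
    && apt-get install -y --no-install-recommends phantomjs \
    && rm -rf /var/lib/apt/lists/*

# Layer 2 (rarely changes): Hail stays after the static layer, so moving
# PhantomJS up doesn't invalidate the cached Hail install.
RUN pip install --no-cache-dir hail
```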