pangeo-data / pangeo-stacks

Curated Docker images for use with Jupyter and Pangeo
https://pangeo-data.github.io/pangeo-stacks/
BSD 3-Clause "New" or "Revised" License

add git commit hash to docker tag? #17

Open rabernat opened 5 years ago

rabernat commented 5 years ago

Currently we are tagging our images based on calver: CALVER="$( date '+%Y.%m.%d' )". This could lead to problems if we need to push multiple changes per day.

What if the tags were instead like our helm charts: 19.03.09-a0475df etc?
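A minimal sketch of what such a tag could look like, assuming the build script can shell out to git. The function names calver, short_head_hash and image_tag are made up for illustration and are not part of this repo. Note that the helm chart example above uses a two-digit year, while this sketch keeps the four-digit format of the current CALVER line.

```python
import datetime
import subprocess


def calver(today=None):
    """Date-based version, same format as `date '+%Y.%m.%d'`."""
    today = today or datetime.date.today()
    return today.strftime("%Y.%m.%d")


def short_head_hash(length=7, cwd=None):
    """Abbreviated hash of the commit currently checked out (HEAD)."""
    out = subprocess.check_output(
        ["git", "rev-parse", f"--short={length}", "HEAD"], cwd=cwd
    )
    return out.decode().strip()


def image_tag(version, commit_hash):
    """Join the calver and the commit hash, e.g. 2019.03.09-a0475df."""
    return f"{version}-{commit_hash}"
```

Two builds on the same day would then differ in the hash part, e.g. image_tag(calver(), short_head_hash()).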

guillaumeeb commented 5 years ago

What I proposed (and actually implemented) for our helm chart is to include the commit hash in the version for every PR build, and to omit it when building from a git tag, i.e. for "stable" releases.

rabernat commented 5 years ago

That sounds perfect! Can we do the same thing here?

jhamman commented 5 years ago

I'm happy to see the convention updated as you all think best. I do want to make sure we maintain the latest tag as well but other than that, I'm keen to follow your lead.

rabernat commented 5 years ago

How would we implement the commit hash idea? Can someone link to the code that does this in the helm chart? I couldn't find it.

jhamman commented 5 years ago

I think the helm chart is using the TRAVIS_COMMIT_RANGE env variable. We could do the same.

yuvipanda commented 5 years ago

I love using commit hash for this, since then you can always easily tell 'so what are we really running?'. I think adding the date is also a good human readable touch, so we should do CALVER-<commit-hash>.

You should use the first 6-7 characters of git rev-parse HEAD - this is the commit hash of the current commit you are building on. If you wanna base it instead on the last time a particular directory (and so a particular image) was changed, you can do git log --pretty=format:'%h' -n 1 <directory-name> instead. However, in this case now that #31 has landed, you need to do inheritance checking - if base-image is modified, that needs to trigger a commit hash change for everything else downstream from there...

There's a golang YAML parser quirk that causes buggy annoying behavior if all the parts of a truncated commit hash are numbers. So you should have code that makes sure your truncated hash is not all numbers - just include more chars until it isn't.

You can find code that implements all this in https://github.com/yuvipanda/hubploy/blob/master/hubploy/gitutils.py
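For reference, a rough sketch of the two pieces described above: the last commit that touched a directory, and the "not all numbers" guard. This is an illustration under assumptions, not a copy of the linked gitutils.py; the names last_commit_for_path and safe_short_hash are hypothetical.

```python
import subprocess


def last_commit_for_path(path, cwd=None):
    """Full hash of the most recent commit that touched `path`."""
    out = subprocess.check_output(
        ["git", "log", "-n", "1", "--pretty=format:%H", "--", path], cwd=cwd
    )
    return out.decode().strip()


def safe_short_hash(full_hash, min_length=7):
    """Truncate a hash, adding characters until it is not all digits.

    An all-digit tag can be read as a number by a YAML parser, so keep
    extending the prefix until it contains at least one letter (or until
    the whole hash has been used).
    """
    length = min_length
    while length < len(full_hash) and full_hash[:length].isdigit():
        length += 1
    return full_hash[:length]
```

A per-image tag suffix would then be something like safe_short_hash(last_commit_for_path("base-image")), where the directory name is only an example.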

guillaumeeb commented 5 years ago

On the helm-chart, this is mainly done through chartpress, see https://github.com/pangeo-data/helm-chart/pull/85/files.

We first run chartpress to populate the version with CALVER, and then rerun it at deploy time: if the build is not on a git tag it appends the commit hash, and if it is on a tag it forces the version to CALVER only.
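The tag-or-not rule itself is small. Here is a sketch of just that decision, independent of chartpress; the function name image_tags and its parameters are made up for illustration. The optional latest entry is there only because keeping that tag was asked for earlier in this thread.

```python
def image_tags(calver, short_hash, on_git_tag, include_latest=True):
    """Return the list of docker tags to push for one build.

    on_git_tag: True when the build runs on a git tag ("stable" release),
    in which case the version is the bare calver. Otherwise the commit
    hash is appended so that several builds per day stay distinct.
    """
    version = calver if on_git_tag else f"{calver}-{short_hash}"
    tags = [version]
    if include_latest:
        # Assumption: 'latest' keeps moving with every build.
        tags.append("latest")
    return tags
```

How on_git_tag gets set is left open here; on Travis it could come from checking whether the TRAVIS_TAG environment variable is non-empty.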

@yuvipanda could we do something simple only using git rev-parse HEAD ? Is there duplicated machinery between chartpress and hubploy?

yuvipanda commented 5 years ago

@guillaumeeb git rev-parse HEAD means it'll change whenever the repo changes (for a README change, for example) rather than whenever the image itself changes. The code is duplicated between hubploy and chartpress, I think I literally copy pasted it :D It's only a one-liner tho so...

guillaumeeb commented 5 years ago

And your code implements the inheritance checking too?

yuvipanda commented 5 years ago

@guillaumeeb ah, no it does not :) However, if you have other code that automatically updates FROM tags across an inheritance (since everything is explicitly versioned, you'd have to do this), then the code I have should work automatically.
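To make the inheritance part concrete, here is one possible sketch of the check, assuming we can describe the FROM relationships as a simple mapping. The function name images_to_retag and the image names in the usage note are hypothetical, and this is not code from hubploy or chartpress.

```python
def images_to_retag(parents, changed):
    """Return every image that needs a new tag.

    parents: dict mapping image name -> parent image name (or None for
             an image that does not build FROM another image in the repo).
    changed: set of image names whose own directory was modified.

    An image needs a new tag if it, or any image it inherits from,
    was modified.
    """
    result = set()
    for image in parents:
        node = image
        seen = set()  # guards against accidental cycles in the mapping
        while node is not None and node not in seen:
            if node in changed:
                result.add(image)
                break
            seen.add(node)
            node = parents.get(node)
    return result
```

For example, with parents = {"base-image": None, "child-a": "base-image", "child-b": "child-a"}, a change to base-image marks all three images, while a change to child-b marks only child-b.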

guillaumeeb commented 5 years ago

So a few questions on how to implement this:

guillaumeeb commented 5 years ago

Any thoughts, @jhamman or @yuvipanda ?