mozilla-services / Dockerflow

Cloud Services Dockerflow specification
Apache License 2.0
199 stars 28 forks source link

include CIRCLE_TAG in the cache key #44

Closed pjenvey closed 4 years ago

pjenvey commented 6 years ago

to avoid deploy steps potentially picking up a previous merge build that wrote to the cache out of order

..which causes our deployment verifier to fail

mostlygeek commented 6 years ago

The cache keys are all different doesn't that break cache restore in the deploy step?

pjenvey commented 6 years ago

How are they are different? CIRCLE_TAG would be equivalent for the related build/deploy steps and restore_cache looks for a prefix (so epoch doesn't factor in)

mostlygeek commented 6 years ago

I just refreshed my memory on how circle's cached worked. Could you write a comment above the cache key commenting on why we chose that schema for posterity?

sciurus commented 4 years ago

Is this change still needed?

sciurus commented 4 years ago

A project I support just hit this problem. The contents of the image we had autodeployed to stage did match what we expected.

Ordered by workflow, here is what seems to have happened

workflow 58 build 144 built an image for a22f2d9, which adds tag 0.0.5, and logged "Stored Cache to v1--1599842496" workflow 58 build 148 logged "Found a cache from build 144 at v1-", restored 4ea2177f, and ran tests against it workflow 58 build 149 logged "Found a cache from build 146 at v1-", restored b9bf2ad5 and pushed it to dockerhub as 0.0.5

workflow 59 build 146 built an image for commit 994c5aa, an unmerged PR bumping dependencies, and logged "Stored Cache to v1-renovate/pin-dependencies-1599842530" workflow 59 build 150 logged "Found a cache from build 146 at v1-renovate/pin-dependencies", restored b9bf2ad5, and ran tests against it

So workflow 58 (the tag on master branch) and 59 (the PR) ran concurrently, and the last step of 58 which pushes the image to dockerhub restored and pushed an image from 59 instead.

This happened because we say to

      - save_cache:
          key: v1-{{ .Branch }}-{{epoch}}
      - restore_cache:
          key: v1-{{.Branch}}

and for master the is no branch, so the key is just v1-. CircleCI docs say that

A key is searched against existing keys as a prefix.

Note: When there are multiple matches, the most recent match will be used, even if there is a more precise match

Build 149 had the choice of restoring either the cache "v1--1599842496" from build 144 or the cache "v1-renovate/pin-dependencies-1599842530" from build 146. Consistent with the docs it chose the latter, even though its from a different workflow.

With this PR, the caches would have been named "v1--0.0.5-1599842496" and "v1-renovate/pin-dependencies--1599842530". We would have tried to restore "v1--0.0.5" so we would have selected the right one.

So I think this is okay. Still @pjenvey I wonder why you chose CIRCLE_TAG instead of CIRCLE_SHA1? Since a tag is mutable, I think there is still potential for cache confusion if someone pushes a tag, then immediately repushes the tag pointing to a new commit.

jbuck commented 4 years ago

I think we'd have to add the identifier to line 69 as well.

I think bpitts is right, adding the SHA1 is probably the most correct way of caching data, even if it means the cache can only be used within the same commit.

sciurus commented 4 years ago

I'm wondering if the below does what we want, or if there is some downside I'm not seeing that lead to Branch being picked originally.

diff --git a/.circleci/config.yml b/.circleci/config.yml
index fc0b225..a5da9c9 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -35,7 +35,7 @@ jobs:
           name: docker save app:build
           command: mkdir -p /cache; docker save -o /cache/docker.tar "app:build"
       - save_cache:
-          key: v1-{{ .Branch }}-{{epoch}}
+          key: v1-{{ .Environment.CIRCLE_SHA1 }}-{{epoch}}
           paths:
             - /cache/docker.tar

@@ -45,7 +45,7 @@ jobs:
     steps:
       - setup_remote_docker
       - restore_cache:
-          key: v1-{{.Branch}}
+          key: v1-{{.Environment.CIRCLE_SHA1}}
       - run:
           name: Restore Docker image cache
           command: docker load -i /cache/docker.tar
@@ -60,7 +60,7 @@ jobs:
     steps:
       - setup_remote_docker
       - restore_cache:
-          key: v1-{{.Branch}}
+          key: v1-{{.Environment.CIRCLE_SHA1}}
       - run:
           name: Restore Docker image cache
           command: docker load -i /cache/docker.tar
sciurus commented 4 years ago

I'm testing that in https://github.com/mozilla-services/topsites-proxy/pull/49

sciurus commented 4 years ago

Superseded by #57

pjenvey commented 2 years ago

Indeed CIRCLE_SHA1 is the way to go, I just ran into a non-tagged dev build utilizing CIRCLE_TAG that still hit this issue. Like bpitts said, it would also be an issue for a re-pushed tag.