There was a small outage reported by UptimeRobot for plone.org in week 3. This weekend I checked the state of the cluster and tried to rebalance the frontend and backend containers, because they are all on workers 2 and 3. I then noticed that, according to the swarm manager, the images for backend and frontend can no longer be found:
docker service update plone-org_backend
plone-org_backend
overall progress: 0 out of 2 tasks
1/2: No such image: ghcr.io/plone/ploneorg-backend:latest@sha256:3656f53c665125…
2/2:
and also:
ukhx25v7zle7 plone-org_backend.1 ghcr.io/plone/ploneorg-backend:latest worker02 Running Running 21 hours ago
fshf9gl1ju9c \_ plone-org_backend.1 ghcr.io/plone/ploneorg-backend:latest worker04 Shutdown Rejected 21 hours ago "No such image: ghcr.io/plone/…"
2ii6qcy2zqr7 \_ plone-org_backend.1 ghcr.io/plone/ploneorg-backend:latest worker04 Shutdown Rejected 21 hours ago "No such image: ghcr.io/plone/…"
sn6sc8qmgg6k \_ plone-org_backend.1 ghcr.io/plone/ploneorg-backend:latest worker04 Shutdown Rejected 21 hours ago "No such image: ghcr.io/plone/…"
n7xm96y4es7t \_ plone-org_backend.1 ghcr.io/plone/ploneorg-backend:latest worker04 Shutdown Rejected 21 hours ago "No such image: ghcr.io/plone/…"
Two possible causes: either the images have been removed from ghcr because we are at our maximum capacity, or the PAT that we use in our deploy scripts (DEPLOY_GHCR_READ_TOKEN) has expired (it has) and the Docker swarm managers still have the old key in their distributed configuration.
There was a special way to update that authentication token directly on the manager with some special service update commands, but pushing out a new release is the quickest solution.
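For reference, the manager-side route is most likely `docker login` followed by `docker service update --with-registry-auth`. Here is a minimal sketch, not a record of what we ran before. It assumes a renewed PAT is already exported as DEPLOY_GHCR_READ_TOKEN; `GHCR_USER` and the `DOCKER` override are hypothetical names introduced only for this example.

```shell
#!/bin/sh
# Sketch only: re-send registry credentials to the swarm for one service.
# Assumptions: DEPLOY_GHCR_READ_TOKEN holds a renewed PAT, and GHCR_USER
# (hypothetical variable) holds the account name that owns the token.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the commands.
DOCKER="${DOCKER:-docker}"

refresh_registry_auth() {
    service="$1"
    # Log in on the manager; the token is passed on stdin so it does not
    # show up in the process list or shell history.
    printf '%s' "$DEPLOY_GHCR_READ_TOKEN" \
        | "$DOCKER" login ghcr.io -u "$GHCR_USER" --password-stdin || return 1
    # --with-registry-auth forwards the manager's registry credentials
    # to the swarm agents that have to pull the image.
    "$DOCKER" service update --with-registry-auth "$service"
}
```

Run on a manager as `refresh_registry_auth plone-org_backend`, then again for the frontend service. Setting `DOCKER=echo` first prints the two commands without touching the cluster.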
We could also check whether, and if so how, our images on ghcr.io are affected.
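One way to do that check without pulling anything is `docker manifest inspect`, which asks the registry for the image manifest. The helper below is a rough sketch. `image_exists` and the `DOCKER` override are hypothetical names, and it assumes the machine is already logged in to ghcr.io with a valid token if the package is private.

```shell
#!/bin/sh
# Sketch only: ask the registry whether a tag still resolves to a manifest.
# A non-zero exit means the image is missing OR the credentials cannot read it;
# this helper discards the error text, so it cannot tell the two apart.
image_exists() {
    "${DOCKER:-docker}" manifest inspect "$1" >/dev/null 2>&1
}
```

For example: `image_exists ghcr.io/plone/ploneorg-backend:latest && echo present || echo "missing or not readable"`. Running `docker manifest inspect` directly, without the redirect, shows the registry's error message, which should indicate whether the failure is an authorization problem or a missing manifest.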