snok / container-retention-policy

GitHub action for pruning old GHCR container image versions.
MIT License

Removing untagged images can damage tagged images #63

Closed lukasz-mitka closed 4 months ago

lukasz-mitka commented 1 year ago

Running the action like so:

      - name: Delete old untagged containers
        id: untagged
        uses: snok/container-retention-policy@96e897805acf21aa2ebc21fdf8e04c879e7daf9d # v2.0.1
        with:
          image-names: image
          account-type: org
          org-name: ${{ github.repository_owner }}
          cut-off: 2 weeks ago UTC
          untagged-only: true
          token: ***

Can damage tagged images:

$ docker pull ghcr.io/org/image:tag
tag: Pulling from org/image
manifest unknown

GitHub Support response:

The way Docker publishes images changed recently. Instead of publishing images directly, it now publishes a hidden image alongside a manifest. You can see this below where it says OS / Arch (2). The (2) represents the hidden image and manifest. This has the unfortunate consequence of breaking tools that detect untagged images (because the hidden image isn't tagged directly). When the hidden image is deleted, it leads to this manifest unknown error. Unfortunately there isn't an obvious fix for this.
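To illustrate the support response, here is a minimal sketch of why a naive "untagged" check picks up versions that a tagged image still depends on. The index below is hand-written to resemble what a provenance-enabled build pushes; the digests, the `versions` mapping, and the `referenced_digests` helper are hypothetical placeholders, not output from a real registry or code from this action.

```python
# Hypothetical image index: one platform image plus one attestation manifest.
# Digests are shortened placeholders for readability.
index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:aaa",
            "platform": {"architecture": "amd64", "os": "linux"},
        },
        {
            # The "hidden" entry: provenance attestation for sha256:aaa.
            "mediaType": "application/vnd.oci.image.manifest.v1+json",
            "digest": "sha256:bbb",
            "platform": {"architecture": "unknown", "os": "unknown"},
            "annotations": {
                "vnd.docker.reference.type": "attestation-manifest",
                "vnd.docker.reference.digest": "sha256:aaa",
            },
        },
    ],
}


def referenced_digests(index):
    """Return the digests of every manifest the index points to."""
    return [m["digest"] for m in index.get("manifests", [])]


# Assumed view of the package versions: only the index itself carries the tag.
versions = {"sha256:idx": ["tag"], "sha256:aaa": [], "sha256:bbb": []}

# A naive check treats every version without tags as safe to delete...
naive_untagged = [digest for digest, tags in versions.items() if not tags]

# ...but these "untagged" versions are still referenced by the tagged index.
still_needed = [d for d in naive_untagged if d in referenced_digests(index)]
print(still_needed)
```

Deleting either entry in `still_needed` leaves the tag pointing at an index with a missing child, which matches the `manifest unknown` error above.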

Provenance is enabled by default since docker/build-push-action@v4. My workaround is to disable it:

 - uses: docker/build-push-action@v4
   with:
     provenance: false
     ...

Solution for already broken images: republish them

sondrelg commented 1 year ago

Do you propose we update the README of this action to notify users about this behaviour, or do you think this is something we can fix by changing the implementation?

lukasz-mitka commented 1 year ago

> Do you propose we update the README of this action to notify users about this behaviour, or do you think this is something we can fix by changing the implementation?

My understanding is that there's no fix currently, but that's based on GH support answers; I didn't verify it myself.

So I propose adding a warning in readme and pinning this issue for the time being.

lukasz-mitka commented 1 year ago

I have confirmed that disabling provenance (provenance: false) fixes the issue.

Workaround:

 - uses: docker/build-push-action@v4
   with:
     provenance: false
     ...

sondrelg commented 1 year ago

That's very helpful. Would you be interested in creating a PR too @lukasz-mitka?

lukasz-mitka commented 1 year ago

No, sorry.

mering commented 1 year ago

Duplicate of #43

rohanmars commented 5 months ago

I fixed this in a new project, https://github.com/dataaxiom/ghcr-cleanup-action, without the provenance workaround above. It requires uploading a temporary manifest to unlink the tag, then that can be deleted.

sondrelg commented 5 months ago

Please see https://github.com/snok/container-retention-policy/issues/43#issuecomment-2106346750 @rohanmars. Any reason you don't think that would work?

rohanmars commented 5 months ago

Yes, skipping SHAs is essentially what I did, so as not to delete the platform-specific images linked from the manifest. It gets more complicated when you want to support multiple tags pointing to the same multi-arch image and when you want to actually delete the multi-arch image. In the delete case you would want to include these digests.

sondrelg commented 5 months ago

For when you actually want to delete the multi-arch image, I would have thought that you'd be fine as long as you always make sure not to delete the SHAs associated with the current tag. Any old untagged images can then be deleted safely - assuming you're passing in the SHAs of all multi-arch images. Is there anything else that needs to be taken into account?
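The selection rule described here could be sketched roughly as follows. This is an illustration only, not the action's implementation: `select_deletable`, the version dictionaries, and the digests are all hypothetical, and the protected SHAs are assumed to have been collected beforehand from the manifests of the currently tagged multi-arch images.

```python
from datetime import datetime, timedelta, timezone


def select_deletable(versions, protected_shas, cutoff):
    """Pick untagged versions older than the cut-off that are not protected."""
    protected = set(protected_shas)
    return [
        v["digest"]
        for v in versions
        if not v["tags"]                  # only untagged versions
        and v["updated_at"] < cutoff      # older than the cut-off
        and v["digest"] not in protected  # not linked from a current tag
    ]


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
cutoff = now - timedelta(weeks=2)
old = now - timedelta(weeks=4)

versions = [
    {"digest": "sha256:idx", "tags": ["tag"], "updated_at": old},  # tagged index
    {"digest": "sha256:aaa", "tags": [], "updated_at": old},       # child of the index
    {"digest": "sha256:old", "tags": [], "updated_at": old},       # genuinely orphaned
    {"digest": "sha256:new", "tags": [], "updated_at": now},       # newer than cut-off
]

# Digests referenced by the currently tagged multi-arch image.
protected_shas = ["sha256:aaa"]

print(select_deletable(versions, protected_shas, cutoff))
```

Only the orphaned version is selected; the child of the tagged index survives because its digest is in the protected set, even though it carries no tag itself.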

sondrelg commented 4 months ago

The latest release adds a skip-shas input argument, which can be used to protect against deleting multi-platform images. Please see the new section in the readme for details, and let me know if anything is unclear.

The migration guide for v3 is included in the release post 👍

If you run into any issues, please share them in the issue opened for tracking the v3 release ☺️