dataaxiom / ghcr-cleanup-action

GitHub Container Registry Cleanup Action
BSD 3-Clause "New" or "Revised" License

Fails as it tries to delete the same package twice #37

Closed fmoessbauer closed 1 month ago

fmoessbauer commented 2 months ago

Hi,

We tried to apply this action to the siemens/kas repository, but it fails there because it tries to delete the same package twice in the same run:

[...]
 deleting package id: 183706397 digest: sha256:44871a3c4628a192347a09e39e7e6f5bbdb7e68f95fa8ab3914be3d2301aca7d architecture: amd64
[...]
 deleting package id: 183706397 digest: sha256:44871a3c4628a192347a09e39e7e6f5bbdb7e68f95fa8ab3914be3d2301aca7d architecture: amd64
Error: Package version not found. - https://docs.github.com/rest/packages/packages#delete-package-version-for-an-organization

Full CI output: https://github.com/siemens/kas/actions/runs/10011297384/job/27675632510

rohanmars commented 1 month ago

I believe it's related to #36, where multiple multi-architecture manifests point to the same underlying image (which did not change between builds). I'm working on a fix for that right now.

rohanmars commented 1 month ago

I've refactored the core logic in df5575c, and in doing so I discovered the cause of this issue: the child images were not being guarded against the scenario where different multi-arch images reference the same platform image digest. I believe I've fixed this now (currently on the main branch). I'll hopefully release later this week.

If there is any way you could test this, that would be great.

https://github.com/dataaxiom/ghcr-cleanup-action/blob/df5575cf39073931f0ab604166a54216398b1eee/src/main.ts#L258
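For readers following along, the failure mode can be illustrated with a minimal sketch. This is not the action's actual code: the `MultiArchManifest` type, the `planDeletions` helper, and the shortened digests are all made up for illustration. It only shows the general idea of guarding against a platform digest that is listed by more than one multi-arch manifest, so that it is scheduled for deletion once.

```typescript
// Hypothetical shape for illustration; not the action's real types.
interface MultiArchManifest {
  digest: string;
  platformDigests: string[]; // child image digests listed in the manifest
}

// Collect the digests to delete, scheduling each digest at most once.
function planDeletions(manifests: MultiArchManifest[]): string[] {
  const seen = new Set<string>();
  const plan: string[] = [];
  for (const manifest of manifests) {
    for (const digest of [manifest.digest, ...manifest.platformDigests]) {
      if (seen.has(digest)) {
        continue; // already scheduled via another manifest
      }
      seen.add(digest);
      plan.push(digest);
    }
  }
  return plan;
}

// Two untagged multi-arch manifests sharing the same platform images
// (placeholder digests, shortened for readability).
const untagged: MultiArchManifest[] = [
  { digest: "sha256:aaa", platformDigests: ["sha256:amd64", "sha256:arm64"] },
  { digest: "sha256:bbb", platformDigests: ["sha256:amd64", "sha256:arm64"] },
];

const deletionPlan = planDeletions(untagged);
console.log(deletionPlan);
```

Without the `seen` check, the shared platform digests would appear twice in the plan, and the second delete request for the same package version would fail with the "Package version not found" error from the original report.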

rohanmars commented 1 month ago

I've added a test case c75dc10 from #36 to the project CI workflow which tests this issue indirectly.

fmoessbauer commented 1 month ago

Hi, I gave it a try on ghcr.io/fmoessbauer/kas, a registry copy of the official ghcr.io/siemens/kas registry. There, it worked correctly. Job log: https://github.com/fmoessbauer/kas/actions/runs/10347778072/job/28639207637

Note that the registry copy was created with skopeo sync --all --src docker --dest docker ghcr.io/siemens/kas/kas ghcr.io/fmoessbauer/kas/ and therefore only contains artifacts that are referenced. AFAIK, the OCI registry spec does not provide an endpoint to list all artifacts.

rohanmars commented 1 month ago

That's great to hear. I'm planning to release v1.0.8 later this week to incorporate the latest fixes.

rohanmars commented 1 month ago

Looking at your test run yesterday, I don't think it recreated the same setup, likely because the copied images didn't contain references to the same platform images, which is what your original log showed.

I've implemented an additional test 7f993f3 which outputs the following action log, which is very close to your original (except fixed): https://github.com/dataaxiom/ghcr-cleanup-action/actions/runs/10361019261/job/28681206924

deleting all untagged images
 deleting package id: 257029904 digest: sha256:4b45a528e7526cc38ea35abdcce905c8cbeab1c6d25f78469bd308273cb53edf
 skipping deletion of sha256:f027150874f5e7fd517721736267f83a8eb7b0d40d6d10033f347fbbe2175f18 as it's in use by another image
 skipping deletion of sha256:9e9cf370930547e1eb4683d53348196bbfeecc6243f0c43787c7eb81488f8fd9 as it's in use by another image
 deleting package id: 257029900 digest: sha256:3338b3399136bf4d92641049f93b9b4cb2a7341f40e1fd442184065038aa7a9d application/vnd.in-toto+json
 deleting package id: 257029902 digest: sha256:2b4fdc5b5200ca15f84f4dff65910d18417060314456fdcff10f209b0faf4a14 application/vnd.in-toto+json
 deleting package id: 257029892 digest: sha256:372a4e37c8f870657e7f18e2d18c32ddf66c8056ecdc3ae6de1b70644aaaf1ef
 deleting package id: 257029888 digest: sha256:f027150874f5e7fd517721736267f83a8eb7b0d40d6d10033f347fbbe2175f18 architecture: amd64
 deleting package id: 257029889 digest: sha256:9e9cf370930547e1eb4683d53348196bbfeecc6243f0c43787c7eb81488f8fd9 architecture: arm64
 deleting package id: 257029887 digest: sha256:3b01dd7f8f628ece2819f7319a468efcd5b42e9e0d3ff49245c9f173c6790780 application/vnd.in-toto+json
 deleting package id: 257029890 digest: sha256:192b7f18a2d2045922e699b9a18d3b940ab08bec1e68c6813a124d77d16aad73 application/vnd.in-toto+json
validating multi-architecture/referrers images:
 no errors found
cleanup statistics:
 multi architecture images deleted = 2
 total images deleted = 8

The important change is that it no longer deletes platform images that are still in use elsewhere, which can apparently happen with buildx cached images.
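The "in use by another image" behaviour in the log can be sketched roughly as follows. This is an assumption-laden illustration, not the code in the action: `IndexManifest`, `deleteUntagged`, and the placeholder digests are hypothetical, and no registry calls are made. A platform image is only deleted once no manifest still present in the registry lists it as a child.

```typescript
// Hypothetical shape for illustration; not the action's real types.
interface IndexManifest {
  digest: string;
  children: string[]; // platform image digests referenced by this manifest
}

// Returns a log of the decisions taken instead of calling any registry API.
function deleteUntagged(toDelete: IndexManifest[], all: IndexManifest[]): string[] {
  const log: string[] = [];
  let remaining = all.slice(); // manifests still present in the registry
  for (const manifest of toDelete) {
    remaining = remaining.filter((m) => m.digest !== manifest.digest);
    log.push(`deleting ${manifest.digest}`);
    for (const child of manifest.children) {
      // A child stays while any remaining manifest still references it.
      const inUse = remaining.some((m) => m.children.indexOf(child) !== -1);
      log.push(
        inUse
          ? `skipping deletion of ${child} as it's in use by another image`
          : `deleting ${child}`
      );
    }
  }
  return log;
}

// Two untagged manifests that share both platform images (placeholder digests).
const manifests: IndexManifest[] = [
  { digest: "sha256:aaa", children: ["sha256:amd64", "sha256:arm64"] },
  { digest: "sha256:bbb", children: ["sha256:amd64", "sha256:arm64"] },
];

const decisions = deleteUntagged(manifests, manifests);
decisions.forEach((line) => console.log(line));
```

In this sketch the shared platform images are skipped while the second manifest still exists and are deleted only when that manifest is removed, which is consistent with the ordering in the log above. If a tagged manifest referenced the same children, they would be skipped every time.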

fmoessbauer commented 1 month ago

> I've implemented an additional test https://github.com/dataaxiom/ghcr-cleanup-action/commit/7f993f33af9dfb768ec7a9ba00b38e5967b75345 which outputs the following action log, which is very close to your original (except fixed):

Thanks, much appreciated. Recreating these conditions on a registry is challenging, as you cannot easily create a 1:1 copy of an existing one. And you also don't want to test new features against production repos (we learned that the hard way when applying actions/delete-package-versions to kas, which STILL eats multi-arch images). That's why we came here in the first place :smile:

rohanmars commented 1 month ago

That's why I came here also, haha. I couldn't find an action that didn't eat packages and seamlessly worked in forked repos.

I've released v1.0.8/v1 ff99a6e, so closing this issue.

Thanks for your feedback/help.