Open kolaente opened 2 years ago
Just for clarification, a repo has no impact on packages:
Should this be a user/org setting or a global config one?
I checked again how I implemented this and currently there are no untagged images in the container registry! (Exception: If you upload a multiarch image, the different arches are untagged images) If you tag and push an image you can later pull that image with the tag and its hash. If a tag gets pushed again the old tag/version gets removed and that deletes the hash reference too. So after that operation there is no untagged image available anymore.
So at the moment the cleanup does not need to remove untagged images because there are none. The question should first be "Should Gitea keep untagged versions?"
Use case sounds pretty similar to git gc, which we already automatically run as a cron IIRC.
Should this be enabled automatically?
If it's stable, I'd say so.
Should this be a repo/org setting or a global config one?
I think global is sufficient. Ideally it should just be another cron to clean up orphaned images, like we already do for orphaned git commits via git gc.
I came across this issue after experiencing the same effect. Building multiarch images where only the manifest is tagged left me with lots of "packages" identified only by their digest (the manifest itself had only one copy, since it was tagged).
Tagging each arch so it gets overwritten makes the "details" tab a bit impractical when you have too many different architectures and versions (for matrix builds).
Should this be a repo/org setting or a global config one?
In my case, I would be happy with the exact same global mechanism described (similar to the cron that runs git gc).
I am also looking for a similar feature. Going out of my way to manually prune images is painful.
Doesn't #21658 resolve the issue?
@lunny I didn't test it, but I don't think so. The PR allows configuring rules for the removal of tags; I just want to remove every image layer not associated with a tag.
Is this still happening?
No, I have 1.20.4 running and it does not happen.
No one has implemented this yet, but it's definitely a vital feature to conserve disk space.
Maybe it should be disabled by default to support pulling images by hash, which is a rare but valid use case.
Has anyone tried this cleanup rule?
Has anyone tried this cleanup rule?
Using that and then checking with the preview yields no results, so it does not look like it's working.
It looks like the official docker registry implementation uses this function to find and remove all untagged layers, as described here.
@KN4CK3R As far as I understood from glancing over the code, Gitea does not just "embed" the official registry package, so it's not as easy as just copying or calling that function?
I'm facing the same issue with the latest version of Gitea.
Has anyone tried this cleanup rule?
The following seems to work perfectly! It deletes all images that do not have an associated tag. I would just suggest using ^sha256:.+ instead, as you could otherwise match a tag that for some reason has sha256 in the middle.
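To illustrate the anchoring point, here is a minimal sketch using Go's regexp package. The unanchored pattern `sha256` and the tag `build-sha256-deadbeef` are assumptions made up for the comparison; the rule being discussed is not shown in the thread, and this is not Gitea's cleanup code.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchoredDigest only matches version strings that start with "sha256:",
// which is how digest-only (untagged) versions are written in this thread.
var anchoredDigest = regexp.MustCompile(`^sha256:.+`)

// looseDigest is a hypothetical unanchored pattern, shown only for comparison.
var looseDigest = regexp.MustCompile(`sha256`)

func main() {
	versions := []string{
		"sha256:0123abcd",       // digest-only version
		"latest",                // ordinary tag
		"v1.2.3",                // ordinary tag
		"build-sha256-deadbeef", // hypothetical tag with "sha256" in the middle
	}
	for _, v := range versions {
		fmt.Printf("%s anchored=%t loose=%t\n",
			v, anchoredDigest.MatchString(v), looseDigest.MatchString(v))
	}
}
```

With the anchored pattern, only the digest-only entry is selected; the unanchored one would also select the hypothetical tag.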
Be careful with this approach when using multi-platform images! In this case the individual platform images might be untagged, but they may still be referenced (by the multi-platform manifest, that is). I'm currently trying to deal with this problem myself and have not yet found a way that does not require deeper insight into the relationships of the images involved.
If someone has something to suggest that'd be very welcome!
Yes, the cleanup rule deletes platform variant images.
Until we can find an integrated solution, I ended up with an external cronjob that prunes old images in my self-hosted instance.
I queried the registry API and used the Gitea Go SDK. It's hacky, but it's working.
When pushing new docker images for an existing tag, the old image still exists and uses up storage on the server. While you can use images just by pointing to their sha, I've yet to find someone who actively uses that. For my own registry (portus) I have a cron job to automatically remove everything that does not have a tag pointing to it. Docker even has a command for this.
Having a cleanup job like that would allow keeping old versions while still solving the storage space problem.
@KN4CK3R in https://github.com/go-gitea/gitea/issues/21658#issuecomment-1301794468:
GitLab has an automatic garbage collection process for this: https://docs.gitlab.com/ee/administration/packages/container_registry.html#removing-untagged-manifests-and-unreferenced-layers
I think it's best to discuss this before implementing, mostly regarding these open questions: