Closed: Macariel closed this issue 8 months ago.
Hi @Macariel - thank you for opening a bug.
> Untagged images are only found for images starting with sha:00 up to sha:47 (can clearly be seen because the list is sorted). There are no images in there starting with sha:5... or any other hex number.
Are the images included in the debug output? One of the very first lines will be the full list of images (as JSON) that it found.
> Not all tagged images are found. For some reason ts_35121... is found but not ts_32555... even though they have been tagged on the same day.
Same question - are those tags present in the list of all things gcr-cleaner found?
My initial guess is that the repo is so large, you're hitting rate limit/quota issues. Those should return an error though, so I'm skeptical.
Assuming you have enough CPUs, you can crank up the concurrency with -concurrency=1000. That will do 1000 operations in parallel, which should speed things up.
Hi @sethvargo thank you for taking a look!
I've captured the debug output using the following command:
GCRCLEANER_LOG=debug ./gcr-cleaner-cli -grace="320h" -repo europe-west1-docker.pkg.dev/project/repo/repo -tag-filter-all "^(ts_.*)" > all 2>&1 --dry-run
Searching the file for sha256:5 or ts_32555 yields no results. There are also no errors or warnings, for that matter. Looking for sha256:4 does give results, in the huge second line.
As for the rate limit, adjusting the grace period yields different results depending on the value:
I'm not sure a rate limit would depend on the grace period here. Also, there are "only" around 50,000 images in the repository, which doesn't sound like too much for Google. And I can fetch all images with the following gcloud command without a problem (while using the same admin account):
gcloud artifacts docker images list europe-west1-docker.pkg.dev/project/repo/repo --limit 100000 --sort-by "UPDATE_TIME" --include-tags --format=json
https://cloud.google.com/artifact-registry/quotas#project-quota
Notably this section:
> The Docker Registry API method to list images returns an incomplete list if a repository has more than 10,000 images or tags. This limitation applies to Docker clients that use the Docker Registry API to interact with registries. The limitation does not apply to the gcloud artifacts docker images list command or Artifact Registry API requests.
We use the Docker registry API because GCR cleaner supports all OCI registries. That appears to be the root of the problem here.
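For background: the Docker Registry v2 tags endpoint (`/v2/<name>/tags/list`) pages its results via an RFC 5988 `Link` header with `rel="next"`, so a listing is only complete once no such header is returned — and a registry that truncates results, per the quota note above, simply stops issuing next links early. A hedged sketch of the next-page extraction (the helper name is illustrative, not gcr-cleaner's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// nextLink extracts the next-page URL from a Docker Registry v2
// Link response header, e.g.
//   </v2/repo/tags/list?last=ts_100&n=1000>; rel="next"
// An empty result means the listing is complete.
func nextLink(header string) string {
	for _, part := range strings.Split(header, ",") {
		part = strings.TrimSpace(part)
		if !strings.Contains(part, `rel="next"`) {
			continue
		}
		start := strings.Index(part, "<")
		end := strings.Index(part, ">")
		if start >= 0 && end > start {
			return part[start+1 : end]
		}
	}
	return ""
}

func main() {
	h := `</v2/project/repo/tags/list?last=ts_100&n=1000>; rel="next"`
	fmt.Println(nextLink(h)) // /v2/project/repo/tags/list?last=ts_100&n=1000
	fmt.Println(nextLink(""))
}
```

If the registry silently stops emitting `rel="next"` after ~10,000 entries, the client has no way to tell a truncated list from a complete one, which matches the observed symptom (a sorted list that just ends at sha:47).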
I guess this is not a bug then. If you don't see a good way around it you can close the ticket :thinking: Thank you very much for this insight!
TL;DR
Not all relevant images are being found and subsequently deleted; some images are omitted for unknown reasons.
Expected behavior
All images matching the filter, and all untagged images older than the grace period, should be listed and deleted.
Observed behavior
Running the simple command
./gcr-cleaner-cli -repo europe-west1-docker.pkg.dev/project/repo/repo -grace="320h" -tag-filter-all "^ts_.*" --dry-run
gives a list of all images that would be deleted. However, there are multiple problems with the output:

- Untagged images are only found for images starting with sha:00 up to sha:47 (can clearly be seen because the list is sorted). There are no images in there starting with sha:5... or any other hex number.
- Not all tagged images are found. For some reason ts_35121... is found but not ts_32555... even though they have been tagged on the same day.

Removing the tag filter has the same outcome, as does changing the grace value to bigger or smaller values. The total amount of images changes, so there does not seem to be a hard cap on the number of images; it must be something different.
Coupled with the fact that we have multi-arch builds (so there are a lot of images in there that cannot be deleted due to dangling parents), this means that we are barely deleting any images in a 3h window.
Looking through the output by running the command with
GCRCLEANER_LOG=DEBUG
does not show any of the missing images either.

Debug log output
How are you running gcr-cleaner?
CLI
gcr-cleaner version
v0.11.1
Environment
I installed with
go install github.com/GoogleCloudPlatform/gcr-cleaner/cmd/gcr-cleaner-cli@v0.11.1
and I am using it under Linux.

The repository in question is very big: there is currently about 10 TB of images in there, which is why we want to clean it up. The runtime is therefore also pretty slow, and with a grace period of 720h we are barely getting the job to run in under 3h. However, the dry run is very fast.
Additional information
No response