GoogleCloudPlatform / gcr-cleaner

Delete untagged image refs in Google Container Registry or Artifact Registry
Apache License 2.0

Fix bugs in recursive repo resolution #82

Closed sethvargo closed 2 years ago

sethvargo commented 2 years ago

The previous recursive implementation is flawed in a few ways:

  1. It queries the entire registry for each repo, which is horribly inefficient.

  2. It misses cases where the repo FQDN might not exactly match the given root (e.g. trailing slashes).

  3. It lacks decision logging, making it incredibly difficult to debug.

The New and Improved Implementation :tm: removes the need for recursion:

  1. Since the Docker v2 API requires us to list the entire catalog anyway, the new implementation parses the given repos, then extracts and de-dupes their registry components. Since it is impossible for a catalog entry to point to something outside of its registry, this is safe to pre-compute.

  2. Each registry is queried exactly once, and its catalog entries are compared against the supplied root prefixes. If a catalog entry begins with a root, it is considered a valid target. This means a repo of "gcr.io/foo" lists all of gcr.io and then selects, client-side, any catalog entries that begin with "gcr.io/foo".

  3. Every action and decision is logged at the debug level, so users and maintainers can more easily identify potential issues.
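
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `catalogFunc` and `resolveRepos` are hypothetical names, the catalog lookup is injected as a function so the example runs without a real registry, and `log.Printf` stands in for whatever debug logger the project actually uses.

```go
package main

import (
	"fmt"
	"log"
	"sort"
	"strings"
)

// catalogFunc is a hypothetical stand-in for a Docker v2 catalog call. It
// returns every repository path in a registry, without the registry host.
type catalogFunc func(registry string) ([]string, error)

// resolveRepos expands the given roots into every catalog entry that begins
// with one of them, listing each registry's catalog only once.
func resolveRepos(roots []string, list catalogFunc) ([]string, error) {
	// Step 1: normalize the roots and de-dupe their registry components.
	registries := map[string]struct{}{}
	cleaned := make([]string, 0, len(roots))
	for _, root := range roots {
		root = strings.TrimRight(strings.TrimSpace(root), "/")
		if root == "" {
			continue
		}
		cleaned = append(cleaned, root)
		host := strings.SplitN(root, "/", 2)[0]
		registries[host] = struct{}{}
	}

	// Step 2: query each registry once and select matches client-side.
	matches := map[string]struct{}{}
	for registry := range registries {
		entries, err := list(registry)
		if err != nil {
			return nil, fmt.Errorf("failed to list catalog for %s: %w", registry, err)
		}
		for _, entry := range entries {
			full := registry + "/" + entry
			matched := false
			for _, root := range cleaned {
				if strings.HasPrefix(full, root) {
					// Step 3: log every decision.
					log.Printf("debug: including %s (matches root %s)", full, root)
					matches[full] = struct{}{}
					matched = true
					break
				}
			}
			if !matched {
				log.Printf("debug: skipping %s (matches no root)", full)
			}
		}
	}

	// Sort so the result does not depend on map iteration order.
	out := make([]string, 0, len(matches))
	for repo := range matches {
		out = append(out, repo)
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	// Fake catalog data for illustration only.
	fake := map[string][]string{
		"gcr.io": {"foo/app", "foo/app/cache", "bar/app"},
	}
	list := func(registry string) ([]string, error) { return fake[registry], nil }

	repos, err := resolveRepos([]string{"gcr.io/foo/", "gcr.io/foo/app"}, list)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(repos)
}
```

Note that a plain prefix check like the one in this sketch would also select a sibling such as "gcr.io/foobar" for a root of "gcr.io/foo". Whether that is acceptable depends on how strictly roots should be treated as path boundaries.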