cert-manager / testing

Repository containing cert-manager testing infrastructure configuration

Mitigate the impact of Trivy DB download rate limiting on GHCR by using a Trivy DB mirror #1062

Closed wallrj closed 1 month ago

wallrj commented 1 month ago

Mitigate the impact of Trivy DB download rate limiting on GHCR

E.g.

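2024-09-30T19:10:14Z FATAL Fatal error init error: DB error: failed to download vulnerability DB: database download error: OCI repository error: 1 error occurred:
* GET https://ghcr.io/v2/aquasecurity/trivy-db/manifests/2: TOOMANYREQUESTS: retry-after: 203.192µs, allowed: 44000/minute

-- https://prow.infra.cert-manager.io/view/gs/cert-manager-prow-artifacts/logs/ci-cert-manager-release-1.16-trivy-test-cainjector/1840831081448738816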

By using an alternative registry mirror for the database.

It is more convenient to fix this here in the testing repo than to update the Make rule in makefile modules and then backport it to all the affected release- branches. Hopefully Trivy will find a general workaround in a future version. Maybe they will persuade GHCR to increase its rate limits, or they'll switch to a new default registry, or find some alternative non-OCI mechanism for distributing the DB.
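
As a rough illustration of the mirror approach, the sketch below sets the `TRIVY_DB_REPOSITORY` environment variable before Trivy runs. It assumes the mirror address used in the local test in this PR, and it keeps any value the caller has already exported. How the variable is actually injected into the prow jobs is not shown here.

```shell
#!/bin/sh
# Assumption: the mirror is the public ECR copy of the Trivy DB used in this PR.
# Keep a value that is already set; otherwise fall back to the mirror.
: "${TRIVY_DB_REPOSITORY:=public.ecr.aws/aquasecurity/trivy-db:2}"
export TRIVY_DB_REPOSITORY

echo "Trivy will fetch its DB from: ${TRIVY_DB_REPOSITORY}"

# Any Trivy process started from this shell inherits the setting, e.g.:
#   trivy image --input _bin/containers/cert-manager-controller-linux-amd64.tar
```

The same value can be passed on the `make` command line instead, since variables given there are placed in the environment of the recipes that `make` runs.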

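The aim is to get the test grid green before the cert-manager 1.16 release:

* https://testgrid.k8s.io/cert-manager-periodics-release-1.16

See:

- https://kubernetes.slack.com/archives/CDEQJ0Q8M/p1727433814116729
- https://github.com/aquasecurity/trivy-action/issues/389
- https://aquasecurity.github.io/trivy/v0.55/docs/configuration/db/#db-repository
- https://aquasecurity.github.io/trivy/v0.55/docs/configuration/#environment-variables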

wallrj commented 1 month ago

Not sure how to test the updated prow job without merging this, but I have tested locally:

$ make trivy-scan-all TRIVY_DB_REPOSITORY=public.ecr.aws/aquasecurity/trivy-db:2
cd cmd/controller && GOOS=linux GOARCH=amd64 CGO_ENABLED=0 GOEXPERIMENT= GOMAXPROCS= /home/richard/projects/cert-manager/cert-manager/_bin/tools/go build -o ../../_bin/server/controller-linux-amd64 -trimpath -ldflags '-w -s -X github.com/cert-manager/cert-manager/pkg/util.AppVersion=v1.16.0-beta.0-1-g440cd54cb542ef -X github.com/cert-manager/cert-manager/pkg/util.AppGitCommit=440cd54cb542efafceb473b2f91538463e378d64' main.go
docker build --quiet \
        -f hack/containers/Containerfile.controller \
        --build-arg BASE_IMAGE=gcr.io/distroless/static-debian12@sha256:262ae336f8e9291f8edc9a71a61d5d568466edc1ea4818752d4af3d230a7f9ef \
        -t cert-manager-controller-amd64:v1.16.0-beta.0-1-g440cd54cb542ef \
        _bin/scratch/build-context/cert-manager-controller-linux-amd64/ >/dev/null
docker save cert-manager-controller-amd64:v1.16.0-beta.0-1-g440cd54cb542ef -o _bin/containers/cert-manager-controller-linux-amd64.tar >/dev/null
[info]: downloaded /home/richard/projects/cert-manager/cert-manager/_bin/downloaded/tools/trivy@v0.54.1_linux_amd64
/home/richard/projects/cert-manager/cert-manager/_bin/tools/trivy image --input _bin/containers/cert-manager-controller-linux-amd64.tar --format json --exit-code 1
2024-10-01T10:49:14+01:00       INFO    [db] Need to update DB
2024-10-01T10:49:14+01:00       INFO    [db] Downloading DB...  repository="public.ecr.aws/aquasecurity/trivy-db:2"
53.85 MiB / 53.85 MiB [------------------------------------------------------------------------------------] 100.00% 4.03 MiB p/s 14s
...
inteon commented 1 month ago

We could also fix this in our Makefiles.

/approve
/lgtm

cert-manager-prow[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: inteon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/cert-manager/testing/blob/master/OWNERS)~~ [inteon]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

cert-manager-prow[bot] commented 1 month ago

@wallrj: Updated the job-config configmap in namespace default at cluster default using the following files:

In response to [this](https://github.com/cert-manager/testing/pull/1062):

>Mitigate the impact of Trivy DB download rate limiting on GHCR
>
>E.g.
>> 2024-09-30T19:10:14Z FATAL Fatal error init error: DB error: failed to
>> download vulnerability DB: database download error: OCI repository error: 1
>> error occurred:
>> * GET https://ghcr.io/v2/aquasecurity/trivy-db/manifests/2: TOOMANYREQUESTS: retry-after: 203.192µs, allowed: 44000/minute
>> -- https://prow.infra.cert-manager.io/view/gs/cert-manager-prow-artifacts/logs/ci-cert-manager-release-1.16-trivy-test-cainjector/1840831081448738816
>
>By using an alternative registry mirror for the database.
>
>It is more convenient to fix this here in the testing repo than to update the Make rule in makefile modules,
>then backport it to all the affected release- branches.
>Hopefully, trivy find some general work around in a future version. Maybe they will persuade GHCR to increase their rate limits.
>Or they'll switch to a new default registry or find some alternative non-oci mechanism for distributing the DB.
>
>Aim is to get the test grid green before the cert-manager 1.16 release:
> * https://testgrid.k8s.io/cert-manager-periodics-release-1.16
>
>See:
>- https://kubernetes.slack.com/archives/CDEQJ0Q8M/p1727433814116729
>- https://github.com/aquasecurity/trivy-action/issues/389
>- https://aquasecurity.github.io/trivy/v0.55/docs/configuration/db/#db-repository
>- https://aquasecurity.github.io/trivy/v0.55/docs/configuration/#environment-variables

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

inteon commented 1 month ago

@wallrj I noticed this error: "init error: DB error: failed to download vulnerability DB: database download error: repository name error (public.ecr.aws/aquasecurity/trivy-db:2:2): could not parse reference: public.ecr.aws/aquasecurity/trivy-db:2:2"

https://storage.googleapis.com/cert-manager-prow-artifacts/logs/ci-cert-manager-release-1.14-trivy-test-webhook/1841117219644248064/build-log.txt
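
The doubled `:2:2` in that error suggests that the Trivy version used on the release-1.14 branch appends the `:2` tag itself, so a repository value that already carries the tag ends up tagged twice. This is an inference from the error text alone, not something confirmed in this thread. Under that assumption, one possible guard is to strip the tag before handing the value to such a version. `strip_db_tag` below is a hypothetical helper, not part of Trivy or of the cert-manager Makefiles.

```shell
#!/bin/sh
# Hypothetical helper: drop a trailing ":<tag>" from an OCI repository
# reference so that a tool which appends its own tag does not yield "repo:2:2".
strip_db_tag() {
    ref=$1
    last=${ref##*/}    # final path component, e.g. "trivy-db:2"
    case $last in
        *@*) printf '%s\n' "$ref" ;;        # digest reference: leave untouched
        *:*) printf '%s\n' "${ref%:*}" ;;   # tag present: remove it
        *)   printf '%s\n' "$ref" ;;        # no tag: nothing to strip
    esac
}

strip_db_tag "public.ecr.aws/aquasecurity/trivy-db:2"
```

Only the last path component is inspected, so a registry port such as `localhost:5000/trivy-db` is not mistaken for a tag. The tagged form should stay as it is for Trivy v0.54.1, which accepted `public.ecr.aws/aquasecurity/trivy-db:2` in the local test earlier in this thread.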