Problem
We are sometimes rate limited by Docker Hub in the e2e tests. When this happens, image pulls fail, and so the entire e2e test job fails.
As a recent example, I saw a couple of cases where deploying the components failed with:
Waiting for daemon set "neonvm-device-plugin" rollout to finish: 0 of 3 updated pods are available...
Error: The action 'deploy components' has timed out after 3 minutes.
and when looking at the events, we see:
LAST SEEN TYPE REASON OBJECT MESSAGE
2m50s Normal Scheduled pod/neonvm-device-plugin-blskl Successfully assigned neonvm-system/neonvm-device-plugin-blskl to k3d-neonvm-agent-0
2m49s Normal AddedInterface pod/neonvm-device-plugin-blskl Add eth0 [10.0.0.154/32] from cilium
75s Normal Pulling pod/neonvm-device-plugin-blskl Pulling image "squat/generic-device-plugin"
72s Warning Failed pod/neonvm-device-plugin-blskl Failed to pull image "squat/generic-device-plugin": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/squat/generic-device-plugin:latest": failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/squat/generic-device-plugin/manifests/sha256:ba6f0b4cf6c858d6ad29ba4d32e4da11638abbc7d96436bf04f582a97b2b8821: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
72s Warning Failed pod/neonvm-device-plugin-blskl Error: ErrImagePull
57s Warning Failed pod/neonvm-device-plugin-blskl Error: ImagePullBackOff
45s Normal BackOff pod/neonvm-device-plugin-blskl Back-off pulling image "squat/generic-device-plugin"
ref
Potential solutions
Maybe we can specify Docker Hub credentials using k3d's registries configuration file? https://k3d.io/v5.6.0/usage/registries/
We might also want to look into implementing this for kind, but that's lower priority because we aren't regularly using it in CI.
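To make the registries-file idea concrete, here is a minimal sketch of how CI could generate that file from secrets. It is not a tested fix. The environment variable names `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN` are hypothetical, and the `registry-1.docker.io` key under `configs` is an assumption taken from the host in the error message above; it should be checked against the k3d/k3s docs.

```python
import json
import os


def render_registries_yaml(username: str, password: str,
                           registry_host: str = "registry-1.docker.io") -> str:
    """Render a registries config with basic auth for a single registry host.

    The default registry_host is an assumption based on the host seen in the
    429 error; confirm the right key before relying on it.
    """
    # json.dumps produces double-quoted, escaped strings, which are also
    # valid YAML double-quoted scalars.
    lines = [
        "configs:",
        f"  {json.dumps(registry_host)}:",
        "    auth:",
        f"      username: {json.dumps(username)}",
        f"      password: {json.dumps(password)}",
    ]
    return "\n".join(lines) + "\n"


def write_from_env(path: str = "registries.yaml") -> None:
    """Write the config using credentials from (hypothetical) CI secrets."""
    text = render_registries_yaml(
        os.environ["DOCKERHUB_USERNAME"],  # hypothetical secret name
        os.environ["DOCKERHUB_TOKEN"],     # hypothetical secret name
    )
    with open(path, "w") as f:
        f.write(text)
```

The resulting file could then be passed at cluster creation with `k3d cluster create --registry-config registries.yaml`. Generating the file at job time keeps the credentials out of the repository.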