ocp-power-automation / ocp4-upi-kvm

OCP4 on KVM/Power
Apache License 2.0

docker.io ratelimit causes OCS CI to break #69

Closed gitsridhar closed 3 years ago

gitsridhar commented 3 years ago

The OCS code in ocs-upi-kvm uses ocp4-upi-kvm to create OCP on libvirt. The master and worker nodes are named master-X and worker-X.

[svenkat@nx123-ahv ~]$ oc get nodes
NAME       STATUS   ROLES    AGE     VERSION
master-0   Ready    master   4h      v1.19.0+d59ce34
master-1   Ready    master   3h59m   v1.19.0+d59ce34
master-2   Ready    master   3h59m   v1.19.0+d59ce34
worker-0   Ready    worker   3h52m   v1.19.0+d59ce34
worker-1   Ready    worker   3h52m   v1.19.0+d59ce34
worker-2   Ready    worker   3h51m   v1.19.0+d59ce34
[svenkat@nx123-ahv ~]$

With this setup, OCS is deployed using OCS-CI. During post-deploy validation, OCS-CI creates a pod from the docker.io nginx image, and it repeats this many hundreds of times.

Recently docker.io introduced a limit on the number of image pulls. Since OCS-CI uses anonymous/unauthenticated access to docker.io, the limit is hit and causes OCS-CI to fail.
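To see how close a host is to the limit, the registry's rate-limit response headers can be inspected. The following is a minimal sketch that only parses such a header value, assuming the `count;w=window_seconds` form (for example `100;w=21600`) described in Docker's documentation. The names `RateLimit` and `parse_ratelimit_header` are hypothetical and not part of OCS-CI or ocp4-upi-kvm.

```python
from typing import NamedTuple


class RateLimit(NamedTuple):
    """Parsed rate-limit header: pull count and window length in seconds."""
    count: int
    window_seconds: int


def parse_ratelimit_header(value: str) -> RateLimit:
    """Parse a value such as '76;w=21600' (assumed header format)."""
    count_part, _, params = value.partition(";")
    window = 0  # 0 means the header carried no window parameter
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val:
            window = int(val)
    return RateLimit(int(count_part.strip()), window)


if __name__ == "__main__":
    remaining = parse_ratelimit_header("76;w=21600")
    print(remaining.count, remaining.window_seconds)
```

Fetching the headers themselves requires an authenticated or anonymous token request against the registry, which is left out of this sketch.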

The failure in creating pod:

Warning Failed kubelet Failed to pull image "nginx": rpc error: code = Unknown desc = Error reading manifest latest in docker.io/library/nginx: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning Failed kubelet Error: ErrImagePull
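A CI harness could tell this failure apart from other pull errors by looking for the `toomanyrequests` marker in the event message. This is a minimal sketch, assuming event messages have the same shape as the one above. The helper names `is_pull_rate_limited` and `failed_image` are hypothetical and not taken from OCS-CI.

```python
import re
from typing import Optional

# Marker string that appears in the kubelet event quoted in this issue.
RATE_LIMIT_MARKER = "toomanyrequests"
IMAGE_PATTERN = re.compile(r'Failed to pull image "([^"]+)"')


def is_pull_rate_limited(message: str) -> bool:
    """Return True if the event message reports a registry pull rate limit."""
    return RATE_LIMIT_MARKER in message.lower()


def failed_image(message: str) -> Optional[str]:
    """Extract the image name from a 'Failed to pull image' event, if present."""
    match = IMAGE_PATTERN.search(message)
    return match.group(1) if match else None


if __name__ == "__main__":
    event = (
        'Failed to pull image "nginx": rpc error: code = Unknown desc = '
        "Error reading manifest latest in docker.io/library/nginx: "
        "toomanyrequests: You have reached your pull rate limit."
    )
    if is_pull_rate_limited(event):
        print("rate limited while pulling", failed_image(event))
```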

One possible remedy is to stop naming the master and worker nodes as shown above and instead give each a unique name, with a randomly generated string attached as a suffix. With this, the limit may not be hit across OCS-CI deployments (we have about 4 of them at this point).
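The proposed naming scheme could look roughly like this. It is a minimal sketch, assuming one random hex suffix shared by all nodes of a deployment. The function `unique_node_names` and the exact `role-index-suffix` layout are hypothetical and do not reflect how ocp4-upi-kvm builds names.

```python
import secrets
from typing import List, Optional


def unique_node_names(role: str, count: int, suffix: Optional[str] = None) -> List[str]:
    """Build node names such as 'master-0-ab12' with a shared random suffix.

    If no suffix is given, a 4-character random hex string is generated.
    """
    suffix = suffix or secrets.token_hex(2)  # 2 random bytes -> 4 hex characters
    return [f"{role}-{index}-{suffix}" for index in range(count)]


if __name__ == "__main__":
    cluster_suffix = secrets.token_hex(2)
    print(unique_node_names("master", 3, cluster_suffix))
    print(unique_node_names("worker", 3, cluster_suffix))
```

Passing the same suffix for masters and workers keeps one deployment's nodes recognizable as a group while still differing from other deployments.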

If there is any other solution for this docker.io rate limit problem, please let us know.

gitsridhar commented 3 years ago

I am working on giving the VM names a random string suffix. I will create a PR.

gitsridhar commented 3 years ago

Raised a pull request: https://github.com/ocp-power-automation/ocp4-upi-kvm/pull/70. I did not test this change, as I do not have an environment ready. In my ocs-upi-kvm environment the git submodules are getting wiped out, so I have no idea how I can test this there.

yussufsh commented 3 years ago

Closing this issue as we already support adding random hex in cluster_id.

yussufsh commented 3 years ago

If you think we still need to change the node names, please ensure proper testing. Code changes will also be needed for the hostnames, DHCP and DNS configs.

bpradipt commented 3 years ago

To avoid Docker rate limiting, it's advisable to start using quay.io or other registries.
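Switching registries amounts to replacing the implicit docker.io prefix of a short image name such as `nginx`. The sketch below expands short names following the usual Docker Hub convention (a bare `nginx` becomes `docker.io/library/nginx`, as in the error above) and then points them at another registry. The prefix `quay.io/example-org` is a hypothetical placeholder, not a known location of an nginx image, and both function names are illustrative.

```python
def normalize_image(ref: str) -> str:
    """Expand a short image reference to its fully qualified docker.io form."""
    first, slash, _ = ref.partition("/")
    if not slash:
        # Bare name such as 'nginx' or 'nginx:latest'.
        return f"docker.io/library/{ref}"
    if "." in first or ":" in first or first == "localhost":
        # First component already looks like a registry host.
        return ref
    return f"docker.io/{ref}"


def use_alternate_registry(ref: str, mirror_prefix: str) -> str:
    """Rewrite docker.io references to live under mirror_prefix instead."""
    full = normalize_image(ref)
    if not full.startswith("docker.io/"):
        return full  # already hosted elsewhere, leave untouched
    return f"{mirror_prefix}/{full[len('docker.io/'):]}"


if __name__ == "__main__":
    # 'quay.io/example-org' is a made-up prefix for illustration only.
    print(use_alternate_registry("nginx", "quay.io/example-org"))
```

The image must actually exist at the target location, so in practice it would first have to be copied or mirrored there.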