eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0
6.96k stars 1.19k forks source link

Move all docker images out of dockerhub #18292

Closed sparkoo closed 3 years ago

sparkoo commented 3 years ago

Describe the bug

With new dockerhub pull limits [1][2], which limits anonymous user to 100pulls/6hours, we need to move all our images out of this registry.

This can have impact on development process (I've hit the limit on CI build job https://ci.centos.org/job/devtools-che-pullrequests-build/1663/console).

I think it may affect Che user experience as well. When cluster itself hit the limit and we use plugins/components from dockerhub, the whole Che might become unusable for 6 hours and this is not acceptable.

So this change should include:

[1] - https://docs.docker.com/docker-hub/download-rate-limit/ [2] - https://www.docker.com/pricing

Expected behavior

All our common docker images should be pulled without limits that might block us.

Image list

Done? Docker.io image Replacement image Related issues / WIP / PRs
:+1: docker.io/centos/mongodb-36-centos7 quay.io/eclipse/che--centos--mongodb-36-centos7:latest-a915db7beca87198fcd7860086989fe8a327a1a4f6508025b64ab28fcc7423b2 none, but will need to manually fetch newer image at some point
:+1: docker.io/centos/mysql-57-centos7 quay.io/eclipse/che--centos--mysql-57-centos7:latest-e08ee4d43b7356607685b69bde6335e27cf20c020f345b6c6c59400183882764 none, but will need to manually fetch newer image at some point
--- --- --- ---
:construction: docker.io/centos/postgresql-96-centos7 registry.centos.org/centos/postgresql-96-centos7 https://github.com/eclipse/che/pull/18410#issuecomment-739922638 (pr to move to centos.org image); https://github.com/eclipse/che/pull/16903/files (discussion re: postgres 10, multiarch)
:sos: docker.io/traefik:v2.2.8 ? ?
--- --- --- --- ---
? codercom/code-server ? ?
? docker.io/dirigiblelabs/dirigible-openshift:3.4.0 ? ?
? docker.io/ksmster/che-editor-jupyter:5.7.0 ? ?
? docker.io/wsskeleton/eclipse-broadway ? ?
rhopp commented 3 years ago

I'm raising priority of this issue, as this is currently making the Happy Path tests fails very often.

nickboldt commented 3 years ago

Images on docker.io to which the Che 7.21.1 CSV refers include:

docker.io/centos/mongodb-36-centos7
docker.io/centos/mysql-57-centos7
docker.io/centos/postgresql-96-centos7
docker.io/chinodesuuu/coder:2.1523-vsc1.38.1-che
docker.io/chinodesuuu/coder:2.1650-vsc1.39.2-che
docker.io/chinodesuuu/coder:next
docker.io/dirigiblelabs/dirigible-openshift:3.4.0
docker.io/ksmster/che-editor-jupyter:5.7.0
docker.io/traefik:v2.2.8
docker.io/windup3/rhamt-vscode-extension:java8
docker.io/wsskeleton/eclipse-broadway
ericwill commented 3 years ago

What if we used images from the Red Hat container catalog?

docker.io/centos/mongodb-36-centos7

Replace with rhscl/mongodb-36-rhel7 Size: ~177MB vs ~181MB which we have now.

docker.io/centos/mysql-57-centos7

Replace with rhel8/mysql-80 Size: ~186MB vs ~146MB which we have now.

docker.io/centos/postgresql-96-centos7

Replace with rhel8/postgresql-96 Size: ~151MB vs. ~117MB which we have now.

docker.io/chinodesuuu/coder:2.1523-vsc1.38.1-che docker.io/chinodesuuu/coder:2.1650-vsc1.39.2-che docker.io/chinodesuuu/coder:next

These are all code-server editor images, AFAIK they are community contributed so not really our problem.

docker.io/dirigiblelabs/dirigible-openshift:3.4.0

Not sure if we maintain this or not.

docker.io/ksmster/che-editor-jupyter:5.7.0

Judging by https://github.com/ws-skeleton/che-editor-jupyter, we could probably build this ourselves and stick it on quay.io

docker.io/traefik:v2.2.8

Not used by the plugin/devfile registry so I cannot comment. :smiley:

docker.io/windup3/rhamt-vscode-extension:java8

Can probably be moved to quay.io, but AFAIK this is a community contributed plugin.

docker.io/wsskeleton/eclipse-broadway

This is a POC, from my POV we can remove it all together.

l0rd commented 3 years ago

docker.io/traefik:v2.2.8

~Limits do not apply to official images. We should be fine with that one~. Limits apply to official images too.

benoitf commented 3 years ago

For postgresql it's under discussion since may from Platform team: https://github.com/eclipse/che/pull/16903

ericwill commented 3 years ago

For postgresql it's under discussion since may from Platform team:

16903

So there are two then, the one in the devfile registry and the one being discussed in that issue.

sparkoo commented 3 years ago

docker.io/traefik:v2.2.8

Limits do not apply to official images. We should be fine with that one

Are you sure? I can't find any note about this anywhere. However, I can see this on FAQ (https://www.docker.com/pricing/resource-consumption-updates)

Does it matter what image I am pulling? No, all images are treated equally. The limits are entirely based on the account level of the user doing the pull, not the account level of the repository owner.

l0rd commented 3 years ago

@sparkoo I may have confused that with the image expiration enforcement for inactive images. On the other hand in this blog post Docker is mentioning that:

For the approved, non-commercial, open source projects, we are thrilled to announce that we will suspend data pull rate restrictions, where no egress restrictions will apply to any Docker users pulling images from the approved OSS namespaces.

sparkoo commented 3 years ago

@l0rd it is counting to my limits. I'm using anonymous and I have no idea why I have 500 limit, but as you can see below RateLimit-Remaining indeed decreased after pull.

[~] λ curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
HTTP/1.1 200 OK
Content-Length: 2782
Content-Type: application/vnd.docker.distribution.manifest.v1+prettyjws
Docker-Content-Digest: sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020
Docker-Distribution-Api-Version: registry/2.0
Etag: "sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020"
Date: Fri, 13 Nov 2020 06:23:49 GMT
Strict-Transport-Security: max-age=31536000
RateLimit-Limit: 500;w=21600
RateLimit-Remaining: 497;w=21600

[~] λ docker pull docker.io/traefik:v2.2.8                                                                                                                        
Trying to pull docker.io/traefik:v2.2.8...
Getting image source signatures
Copying blob 636569b0d16c skipped: already exists  
Copying blob cbdbe7a5bc2a [--------------------------------------] 0.0b / 0.0b
Copying blob f16506d32a25 [--------------------------------------] 0.0b / 0.0b
Copying blob 1f5ad3441e73 [--------------------------------------] 0.0b / 0.0b
Copying config c6e6fde072 done  
Writing manifest to image destination
Storing signatures
c6e6fde072260f39a81ffcd1eaaa9e0f3aa8de874d1805a14e7315f0e19c344b

[~] λ curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
HTTP/1.1 200 OK
Content-Length: 2782
Content-Type: application/vnd.docker.distribution.manifest.v1+prettyjws
Docker-Content-Digest: sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020
Docker-Distribution-Api-Version: registry/2.0
Etag: "sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020"
Date: Fri, 13 Nov 2020 06:24:02 GMT
Strict-Transport-Security: max-age=31536000
RateLimit-Limit: 500;w=21600
RateLimit-Remaining: 496;w=21600
l0rd commented 3 years ago

You are right @sparkoo, I have the same behavior.

nickboldt commented 3 years ago

This is literally a blocker as the che 7.22 release today failed due to rate limit:

Step 1/23 : FROM docker.io/alpine:3.11.5 AS builder
toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

-- https://ci.centos.org/job/devtools-che-devfile-registry-release/87/console

dmytro-ndp commented 3 years ago

movement is needed for Centos CI jobs, to unblock Che images build (e.g. che-theia image nightly build https://ci.centos.org/view/Devtools/job/devtools-che-theia-che-build-master/277/console).

nickboldt commented 3 years ago

Script to copy from docker.io/org/image:tag to quay.io/eclipse/che--org--image:tag-SHA:

https://github.com/eclipse/che-release/pull/9

I've run it successfully a few times and it's created 9 repos here, including making the repos public and populated with team/robot users so others can add content.

https://quay.io/organization/eclipse

image

Using those quay versions of images, we can update Che operator's CSV to point to these "safe" images, and not be hit with rate limits.

NOTE that with this script we could also copy images from registry.redhat.io so as to avoid the authentication requirement from that registry... and finally be able to use the latest UBI8 based build images, instead of alpine, ubuntu or RHEL7 ones. Why?

well, here's a hint why that would be good:

image

image

nickboldt commented 3 years ago

PRs for plugin & devfile reg, which we could/should backport to 7.22.x too so that we can test drive the release and verify this makes it more robust.

Merged w/ Eric's approval, and cherrypicked to 7.22.x where branch exists.

nickboldt commented 3 years ago

Next up, PR for operator:

I think I hit all the required files but @tolusha LMK if I missed anything.

l0rd commented 3 years ago

Some centos images that are mirrored on registry.centos.org and quay.io:

dockerhub registry.centos.org quay.io
docker.io/centos/mongodb-36-centos7 registry.centos.org/centos/mongodb-34-centos7
docker.io/centos/mysql-57-centos7 registry.centos.org/centos/mysql-80-centos7 quay.io/app-sre/mysql
docker.io/centos/postgresql-96-centos7 registry.centos.org/centos/postgresql-96-centos7 quay.io/app-sre/postgres
tolusha commented 3 years ago

@nickboldt quay.io/eclipse/che--centos--postgresql-96-centos7:9.6-b681d78125361519180a6ac05242c296f8906c11eab7e207b5ca9a89b6344392 quay.io/eclipse/che--traefik:v2.2.8-f5af5a5ce17fc3e353b507e8acce65d7f28126408a8c92dc3cac246d023dc9e8 Can we use shorter tags ? like 9.6 and v2.2.8 ?

l0rd commented 3 years ago

@tolusha @nickboldt I would suggest that you use postgres from quay.io that is mirrored by app-sre team rather than mirroring it ourself.

nickboldt commented 3 years ago

Can we use shorter tags ? like 9.6 and v2.2.8 ?

We could, yes. But since I was blindly copying from upstream projects that we don't control, I figured using a SHA to prevent the tag from moving suddenly was worthwhile. This way we would never have a situation where :9.6 in quay != :9.6 in docker.io, and might introduce unexpected behaviour or confuse people.

nickboldt commented 3 years ago

use postgres from quay.io

Great idea but it's harder to know what's 100% compatible when comparing:

So rather than introducing unexpected changes between 9.6.?? and 9.6.20 (is that a regression? is that an update?) I figured for the 7.22 release we should JUST take the same bits. Too late for potentially breaking changes. :)

For 7.23, I agree, we need PRs to move to app-sre base images & sidecars.

Of course we COULD also copy from RHEC to quay so that we no longer need to worry about the authentication issue, get faster CVEs, and could then start using the same images in CRW and Che for more parity / consistency:

vs.

l0rd commented 3 years ago

I would prefer if someone else maintain those images. As it was before. And the earlier we change to the final image the better. I don't see why we should wait. If those images don't work I want to know it early. If we are ready, or not, to release Che 7.22 should be a QE decision.

nickboldt commented 3 years ago

If we are ready, or not, to release Che 7.22 should be a QE decision.

It hasn't been a QE decision for months. We specifically removed that gate so that we could do releases faster. Now the only gates are:

But +100 for moving to the latest, supported, someone-else-maintains-them images. Problem is ... of the 3 mentioned above, only 2 are in the app-sre repo, and 2 have high-impact CVEs.

l0rd commented 3 years ago

@nickboldt if the criteria are CVEs let's move to quay.io/app-sre/postgres:9.6.20-alpine:

image

The images you are mirroring from Docker Hub have 30+ vulns:

image

l0rd commented 3 years ago

Mysql on registry.centos.org and MySQL on docker hub are built from the same GitHub repo. But CentOS has a more recent version.

MongoDB on registry.centos.org and [MongoDB on Docker Hub]() are build from the same GitHub repo too. But in this case Docker Hub has a more recent version of Mongo.

I cannot tell why they are publishing different versions on different registries (maybe @hhorak can help us) but those images look good enough for our samples.

Now if you insist and want to maintain more Dockerfiles and CI for those images I won't block you @nickboldt but it looks like something we can avoid.

nickboldt commented 3 years ago

Questions:

nickboldt commented 3 years ago

The images you are mirroring from Docker Hub have 30+ vulns:

I never claimed I was using a better image, only that to allow the release to proceed, I'd copy the docker.io images to quay.

If you want to open a new issue for "move to newer/better/less vulnerable images" that's a different can of worms than "Move all docker images out of dockerhub", which is the title of this issue. :D

benoitf commented 3 years ago

in any case, core images should be multi-arch

hhorak commented 3 years ago

@l0rd The official community images for MySQL and MongoDB that we product are currently located on Docker Hub. The images on registry.centos.org were added with the aim to move them at some point, but it never happened entirely. I'd therefore recommend to use Docker Hub if possible, although we're aware of the limits that Docker hub recently introduced.

The solution @phracek from our team is working on to address the new limits is to move the images to quay.io.

Anyway, it is still not clear to me why the images on registry.centos.org are outdated, my understanding was they should be rebuilt pretty regularly using https://github.com/CentOS/container-pipeline-service. It's not something I'd consider worth exploring though, because of the plan to move to quay.io instead.

benoitf commented 3 years ago

FYI, app-sre images were single-arch but I've asked them to mirror all arches and they've created some PR to enable it: https://github.com/app-sre/sretoolbox/pull/22 https://github.com/app-sre/qontract-reconcile/pull/1203

Also for libraries of dockerhub we can ask app-sre to mirror them it if it's missing (so we could include traefik for example)

https://gitlab.cee.redhat.com/service/app-interface/-/tree/master/data/dependencies/image-mirrors/docker.io https://gitlab.cee.redhat.com/service/app-interface#create-a-quay-repository-for-an-onboarded-app-app-sreapp-1yml and the corresponding mirror configuration: https://gitlab.cee.redhat.com/service/app-interface#mirroring-quay-repositories

nickboldt commented 3 years ago

Does anyone know if the app-sre team can also mirror stuff from registry.redhat.io to quay.io?

Or, as asked on https://projects.engineering.redhat.com/browse/CLOUDDST-3798 ...

can we store a pull secret for registry.redhat.io in a GH action, which would allow authentication without the need for:

nickboldt commented 3 years ago

Turns out we can use a service principal's pull secret to pull images from registry.redhat.io... so we can just start using that when building Che components, when there exists a better option in RHEC than in quay.io/app-sre or docker.io.

https://gist.github.com/nickboldt/a884449ada7a6cdcda85e58c984b5eea

benoitf commented 3 years ago

@nickboldt but anyone should be able to build ( without any secret)

nickboldt commented 3 years ago

well, you need a docker auth secret to build and not hit rate limits (eg., for che-theia or che-server w/ dashboard and ws loader).

And getting a secret for reg.rh.io is just as free, and easy to do, as getting one for quay.io or docker.io: https://access.redhat.com/RegistryAuthentication

So... what's your REAL objection here?

benoitf commented 3 years ago

objection is that I don't log in into quay.io to build che (I'm far to reach any limits) as I don't share my ip. So I don't need to register on any system. I can clone from github without a github account, and build without a dockerhub account (and a redhat account)

nickboldt commented 3 years ago

Unclear when/how we'll close out the requirements here, so I've moved it to 7.x for now. If we're collectively satisfied with what's been done, we can resolve it for an appropriate milestone.

@sparkoo @l0rd @benoitf WDYT?

sparkoo commented 3 years ago

for me, if we're not blocked by dockerhub anymore, or if we've identified bottlenecks and created issues for them to work on, I'm ok with closing this one. I wasn't paying close attention to this issue, so I don't know exactly what has been done.

nickboldt commented 3 years ago

Closing, clearly nothing else happening here.