jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License

AWS ECR registry for BinderHub deployment #705

Open nsriram13 opened 6 years ago

nsriram13 commented 6 years ago

I am trying to use ECR as the Docker image registry for a BinderHub deployment. I was looking at the different settings in the helm chart, and I am not sure how the GCR examples provided in the docs map to ECR.

When explicitly pushing images to a registry, this is the command I use for ECR:

docker push 1234567890.dkr.ecr.region.amazonaws.com/repo:tag

How do I translate this into the various components requested in the values file? Specifically, the following values that are set in configmap.yaml:

registry:
  enabled: true
  prefix: binderhub-local/
  host: https://gcr.io
  authHost:
  authTokenUrl: https://gcr.io/v2/token?service=gcr.io
  username: _json_key
  password:

For authentication, the nodes in our Kubernetes cluster are allowed to pull from ECR because they have IAM roles configured. Can I leave the authentication options blank here?
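As a rough orientation for the question above, the push target decomposes into a registry host, a repository path, and a tag. The following is a minimal Python sketch of that split. The helper name `split_image_reference` is hypothetical, and how the parts map onto the chart's `host` and `prefix` values is an assumption that is not confirmed anywhere in this thread.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    host: str        # registry host, e.g. <account>.dkr.ecr.<region>.amazonaws.com
    repository: str  # repository path inside the registry
    tag: str         # image tag ("latest" if none was given)


def split_image_reference(ref: str) -> ImageRef:
    """Split 'host/repository:tag' into its parts.

    Hypothetical helper for illustration only; it assumes the reference
    always starts with a registry host and does not handle digests.
    """
    host, _, remainder = ref.partition("/")
    repository, _, tag = remainder.rpartition(":")
    if not repository:
        # No ':' in the remainder, so no tag was given.
        repository, tag = remainder, "latest"
    return ImageRef(host, repository, tag)


ref = split_image_reference("1234567890.dkr.ecr.region.amazonaws.com/repo:tag")
# Assumption: the chart's `host` would be "https://" + ref.host and the
# image prefix would live under ref.repository.
print(ref)
```

For ECR the host follows the pattern `<account-id>.dkr.ecr.<region>.amazonaws.com`, so the account and region can be read off the first and fourth dot-separated fields if needed.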

Really appreciate the guidance.

quazzuk commented 5 years ago

I’m unable to reverse engineer the various config settings for ECR using the provided documentation.

Anyone know the answer to the above?

betatim commented 5 years ago

The BinderHub team itself doesn't have much experience with AWS as the deployment we run (mybinder.org) is on GCE. Maybe post this question to http://discourse.jupyter.org/ as well in the hopes that someone who has deployed a BinderHub on AWS (or has experience with AWS) sees it.

If there is an answer, or you manage to work it out, it would be great to add it to the documentation so that others can find it more easily. Rather than starting with a complete guide, just documenting how to do this could kickstart the writing of docs on deploying BinderHub on cloud providers other than GCE.

chicocvenancio commented 5 years ago

I have managed to get BinderHub to successfully use ECR as a Docker registry. There are two "core" issues, plus some DockerRegistry methods that need overriding:

  1. We need to import boto3. (Currently we can do this in a new image or in a postStart lifecycle hook, as I have been doing for development.)
  2. There is no way to override the DockerRegistry class in BinderHub as is. (I think we should allow custom classes to be configured by the user.)
  3. There are two gotchas with ECR that need handling in overridden methods: passwords are valid for only 12 hours, and repositories need to be created before repo2docker pushes the first image. I managed to handle both by using boto3 to get the password and create the repos as needed, and the Kubernetes API to set the password in the push_secret secret so that repo2docker can use it in the build pod.
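The two ECR gotchas in item 3 can be sketched roughly as below. This is not the code from the PR; it is a minimal illustration that assumes an ECR client created with `boto3.client("ecr")` is passed in, and the function names are hypothetical. Writing the password into the Kubernetes push secret is not shown.

```python
import base64
from datetime import datetime, timedelta, timezone


def fetch_ecr_credentials(ecr_client):
    """Return (username, password, expires_at, endpoint) for an ECR registry.

    ECR hands back a base64-encoded "username:password" token together with
    its expiry time, so the password has to be re-fetched periodically.
    """
    data = ecr_client.get_authorization_token()["authorizationData"][0]
    decoded = base64.b64decode(data["authorizationToken"]).decode()
    username, _, password = decoded.partition(":")
    return username, password, data["expiresAt"], data["proxyEndpoint"]


def needs_refresh(expires_at, margin=timedelta(minutes=30), now=None):
    """True once the token is within `margin` of expiring (margin is an arbitrary choice)."""
    now = now or datetime.now(timezone.utc)
    return expires_at - now <= margin


def ensure_repository(ecr_client, name):
    """Create the repository if it is missing; return True if it was created."""
    try:
        ecr_client.create_repository(repositoryName=name)
        return True
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        return False
```

With real credentials this would be driven by something like `client = boto3.client("ecr")`, followed by `ensure_repository(client, "repo")` before each first push and `fetch_ecr_credentials(client)` whenever `needs_refresh` reports that the stored password is about to expire.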

betatim commented 5 years ago

Adding an AWSDockerRegistry class and making it possible to choose it from the helm chart would be a good addition. I think we could add boto3 as a dependency (or is it huuuge?); it's probably not worth making some conditional import thing.

It is probably also worth adding some documentation on how to map from AWS instructions/lingo to what BinderHub uses.

Would be great to have support for BinderHub-on-AWS with all the bells and whistles.

chicocvenancio commented 5 years ago

boto3 itself is only a 128KB wheel. Botocore is a requirement that adds some 5.6MB. With all requirements added by boto3 it should come to 6.5MB. I'll clean up my code and commit in that PR so we can discuss.

ivan-gomes commented 5 years ago

We could really benefit from having ECR support added. PR #920 should satisfy it. Anything else necessary to get it rolled in @betatim?

btjones-me commented 3 years ago

Hi folks, are we still planning on resolving this one? I notice most of the work is completed. I would love to see ECR integrated.

oyamin commented 2 years ago

Any updates on when ECR integration will be available?

yuvipanda commented 8 months ago

@manics does this work now, given we have a mybinder.org federation member on AWS?

manics commented 8 months ago

Yes! It requires https://github.com/manics/binderhub-container-registry-helper to be deployed. This increases the deployment complexity, but avoids needing vendor-specific requirements and code in BinderHub, simplifying maintenance and testing.

Though you've reminded me I need to revisit https://github.com/jupyterhub/binderhub/pull/1637, which acts as the interface between BinderHub and binderhub-container-registry-helper. Currently the registry class is extended in the mybinder.org extra config.