Open consideRatio opened 4 years ago
cc @betatim who was just on an issue where I mentioned Erik working on security stuff. Tim - this is a "meta" repository for the "Jupyter Meets the Earth" project and some conversation about JupyterHub/Binder-related things might happen here, though in general we will try to keep issues etc in the proper JupyterHub repository. Just wanted to let you know about this repo's existence :-)
This is awesome content, @consideRatio!
I want to suggest an alternate approach. If you see https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/jupyterhub/templates/proxy/autohttps/configmap.yaml#L104, we are defining just one router + backend, pointing to CHP. However, it is trivial to add some templating there for extra routers and extra backends. So the z2jh config could take a list of domain names + their backends (as service names or just DNS entries), and add them to the traefik config. This will make traefik acquire HTTPS certs for all of them, and do some routing as well. This should be less work, and make it really easy for users of z2jh to cover other external services they might have.
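To make the idea concrete, here is a rough Python sketch of what "a list of domain names + their backends" could expand into. It is only an illustration under assumptions: the key layout follows Traefik v2's dynamic configuration (`http.routers` / `http.services`), which may not match the Traefik version used in the linked configmap, and the function name, the `extra-N` naming, the `https` entry point, and the `default` cert resolver name are all hypothetical.

```python
import json
from typing import Dict, List


def build_traefik_dynamic_config(
    extra_routes: List[Dict[str, str]],
    cert_resolver: str = "default",  # hypothetical resolver name
) -> dict:
    """Expand a list of {"domain", "backend"} entries into routers + services.

    The structure mirrors Traefik v2 dynamic configuration; this is an
    assumption and not necessarily what the z2jh chart renders.
    """
    routers = {}
    services = {}
    for i, route in enumerate(extra_routes):
        name = f"extra-{i}"  # hypothetical naming scheme
        routers[name] = {
            # Match requests by their Host header.
            "rule": f"Host(`{route['domain']}`)",
            "service": name,
            "entryPoints": ["https"],  # assumed entry point name
            # Ask Traefik to acquire a certificate for this router's domain.
            "tls": {"certResolver": cert_resolver},
        }
        services[name] = {
            "loadBalancer": {"servers": [{"url": route["backend"]}]},
        }
    return {"http": {"routers": routers, "services": services}}


example = build_traefik_dynamic_config(
    [{"domain": "grafana.example.com", "backend": "http://grafana:80"}]
)
print(json.dumps(example, indent=2))
```

In the chart itself this would be Helm templating rather than Python, but the mapping from one list entry to one router plus one backend would be the same.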
What do you think of this?
:wave:
Two points: 1) cert-manager on mybinder.org (finally) just worked (for staging, haven't had time to enable it for prod) 2) I support Chris' suggestion that if you want to discuss architecture/plan changes for binderhub it is probably best to move the conversation there with enough time for the people who hang out there to chime in and give input
A question: is it possible to use the traefik that does LE for JupyterHub to also cover other Ingress objects in the k8s namespace? Thinking of mybinder.org where we have several ingress objects that need certs but aren't part of the binderhub or z2jh chart. The reason for moving forward with cert-manager there was that I think we need something like it to cover all these "extra" ingress objects.
@yuvipanda @betatim @choldgraf this topic was to a large extent a rubber ducking exercise that got out of hand and ended up being technically relevant for repos like z2jh and binderhub. I'll make sure to move the discussion there going forward.
I want to suggest an alternate approach.
@yuvipanda sounds good to me! I'm going for it!
- cert-manager on mybinder.org (finally) just worked (for staging, haven't had time to enable it for prod)
@betatim :tada: yepp! nginx-ingress + cert-manager is the more reliable and scalable solution. For example, it can avoid causing disruptions because the ingress proxy pods are HA and the acquisition of certificates still works well. The Traefik-based approach, by contrast, would be the more lightweight default solution that can help a new binderhub admin quickly get up and running.
A question: is it possible to use the traefik that does LE for JupyterHub to also cover other Ingress objects in the k8s namespace? Thinking of mybinder.org where we have several ingress objects that need certs but aren't part of the binderhub or z2jh chart. The reason for moving forward with cert-manager there was that I think we need something like it to cover all these "extra" ingress objects.
@betatim yepp, I think Traefik can do this: it can configure itself to route incoming traffic based on the ingress resources in Kubernetes, and I think it can also acquire certificates for these resources. But, unlike an nginx-ingress + cert-manager setup, Traefik's open source version cannot both run in HA and acquire certificates for the ingress routes.
Helm charts are hard to maintain because they require planning for everything that someone may want to modify, which is impossible. This PR is about setting us up to avoid that in preparation for the coming related PRs.
This infrastructure doesn't modify our current tests, but it allows us to mimic having Let's Encrypt in our CI system, along with the process of acquiring HTTPS certificates and using them.
This is still ongoing. The effort so far...
What remains is to update https://github.com/jupyterhub/binderhub/pull/1101 now that the pieces are in place, which also requires some CI updates to the BinderHub infrastructure.
I have not dropped the ball on this, but my attention isn't fully focused on it. Here is the current status of things. Since last status update...
Future
I lost momentum waiting for step-along-the-way PRs to be merged and didn't get back to it. I still have https://github.com/jupyterhub/binderhub/pull/1179 open, but I don't think it is a roadblock.
This work can continue at this point.
Goal
To make it easy to set up BinderHub's Helm chart to use HTTPS, which means that network traffic between the user and BinderHub will be encrypted.
History
We have used kube-lego to acquire certificates, but it was deprecated and could not comply with a new requirement from Let's Encrypt, the CA we interact with. We considered kube-lego's successor, cert-manager, but concluded that it came with a bit too much overhead. This is why we now want a more lightweight solution.
Here is a related issue about not being able to use kube-lego: https://github.com/pangeo-data/pangeo-binder/issues/127
Theory 101
For a BinderHub user to establish secure communication (HTTPS) with a BinderHub server at binder.example.com, some things need to happen first.
Choose a CA (Let's Encrypt): We need a common trusted party, a Certificate Authority (CA). Let's Encrypt is such a CA; it is well trusted and free to use.
Prove domain ownership -> acquire signed domain certificate: The CA can issue a signed domain certificate, which the domain owner needs later as proof, but only to a domain owner that can prove its ownership of the domain to the CA. This is where the ACME protocol is useful. BinderHub can ask the CA for a http01 challenge to prove it. During the http01 challenge, BinderHub will need to respond in a specific way to requests made to binder.example.com, which helps the CA be confident that BinderHub is in control of the server responding at binder.example.com.
Encrypt HTTP / Decrypt HTTPS: BinderHub needs to encrypt all outgoing traffic and decrypt all incoming traffic. This is called TLS termination and can be done in a standalone manner by a TLS termination proxy or by the webserver serving the BinderHub content. This step requires the certificate we acquired in the previous step.
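As a rough illustration of the http01 challenge in the second step, here is a minimal Python sketch of the responding side. Per the ACME specification (RFC 8555), the CA fetches `/.well-known/acme-challenge/<token>` over plain HTTP and expects the body to be the token joined by a `.` with the base64url thumbprint of the account key. Everything else here is a simplification or an assumption: the function and class names are hypothetical, the thumbprint is passed in as a ready-made string rather than computed from a real key, and in the planned setup Traefik would handle this exchange rather than custom code.

```python
from http.server import BaseHTTPRequestHandler
from typing import Dict, Optional

# Path prefix the CA requests during an http01 challenge (RFC 8555, section 8.3).
ACME_PREFIX = "/.well-known/acme-challenge/"


def key_authorization(token: str, account_thumbprint: str) -> str:
    """Build the expected response body: "<token>.<account key thumbprint>"."""
    return f"{token}.{account_thumbprint}"


def challenge_response(path: str, pending: Dict[str, str]) -> Optional[str]:
    """Return the body for a challenge request, or None if it is not one we know."""
    if not path.startswith(ACME_PREFIX):
        return None
    token = path[len(ACME_PREFIX):]
    return pending.get(token)


def make_handler(pending: Dict[str, str]):
    """Create a request handler class serving the given pending challenges."""

    class ChallengeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = challenge_response(self.path, pending)
            if body is None:
                self.send_error(404)
                return
            data = body.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return ChallengeHandler


# Usage sketch (placeholder token/thumbprint; the CA connects on port 80):
#   from http.server import HTTPServer
#   pending = {"some-token": key_authorization("some-token", "some-thumbprint")}
#   HTTPServer(("0.0.0.0", 80), make_handler(pending)).serve_forever()
```

Once the CA has fetched the expected body, it considers the domain validated and will issue the signed certificate used in the TLS termination step.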
Additional reading
Z2JH's solution
I want to reuse as much as possible from the Z2JH solution implemented in https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1539 by @yuvipanda. Assuming Z2JH was configured to use this solution, the following would happen.
Z2JH's Kubernetes Service: proxy-public
BinderHub's planned solution
I plan to either reference or duplicate this code, and make minor changes to help BinderHub specifically.
.Values.jupyterhub.proxy.traefik.image.name, for example.
BinderHub's Kubernetes Service: binder
The Auto HTTPS part is new, the other is left unchanged.