jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Support usage of JupyterHub's internal_ssl functionality #1520

Open sstarcher opened 4 years ago

sstarcher commented 4 years ago

Currently, the proxy is implemented as a separate deployment. The proxy supports TLS, but it terminates TLS at its own pod and forwards unencrypted traffic to the hub. This is not ideal for security, so it would be preferable to combine the proxy and the hub into the same pod.

Would you accept a PR to combine the proxy and the hub?

manics commented 4 years ago

The proxy and hub have different roles: the hub is responsible for authentication and for managing the user's server, but after that, communication goes directly between the proxy and the user's server, so they can't be combined. For instance, if the hub is upgraded, the proxy can continue to direct traffic to users with a current session.

There's also a plan to replace the proxy with Traefik:

JupyterHub does support the use of SSL for internal traffic: https://jupyterhub.readthedocs.io/en/stable/reference/websecurity.html#encrypt-internal-connections-with-ssl-tls
I don't think this chart supports it, but it sounds like a reasonable addition.

A first step is probably to see if you can get it working using hub.extraConfig: https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/advanced.html#arbitrary-extra-code-and-configuration-in-jupyterhub-config-py

sstarcher commented 4 years ago

Thank you for the detailed description. I'll work on adding SSL support for the hub if it's not something the current chart can already support.

sstarcher commented 4 years ago

@manics any advice on what I might be doing wrong in this very basic case?

hub:
  extraConfig:
    jupyterlab: |
      c.JupyterHub.internal_ssl = True

[I 2019-12-16 16:15:05.711 JupyterHub app:1428] Using existing hub-internal CA
[I 2019-12-16 16:15:05.711 JupyterHub app:1448] Using existing proxy-api CA
[I 2019-12-16 16:15:05.711 JupyterHub app:1448] Using existing proxy-client CA
[W 2019-12-16 16:15:05.743 JupyterHub app:1546] JupyterHub.hub_connect_port is deprecated as of 0.9. Use JupyterHub.hub_connect_url to fully specify the URL for connecting to the Hub.
[I 2019-12-16 16:15:05.790 JupyterHub app:1622] Not using whitelist. Any authenticated user will be allowed.
[I 2019-12-16 16:15:05.831 JupyterHub app:2255] Initialized 0 spawners in 0.008 seconds
[I 2019-12-16 16:15:05.834 JupyterHub app:2464] Not starting proxy
[I 2019-12-16 16:15:05.844 JupyterHub app:2500] Hub API listening on https://:8081/hub/
[I 2019-12-16 16:15:05.844 JupyterHub app:2502] Private Hub API connect url https://100.64.201.14:8081/hub/

When doing a kubectl port-forward locally, I can run openssl and see the certificate, but running curl -k https://localhost:8081/hub/api fails with:
curl: (35) error:1401E410:SSL routines:CONNECT_CR_FINISHED:sslv3 alert handshake failure

And in the hub logs:
[W 2019-12-16 16:15:54.314 JupyterHub iostream:1407] SSL Error on 11 ('127.0.0.1', 43852): [SSL: PEER_DID_NOT_RETURN_A_CERTIFICATE] peer did not return a certificate (_ssl.c:852)
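
The PEER_DID_NOT_RETURN_A_CERTIFICATE message suggests the Hub is asking the client for a certificate, which curl -k on its own does not present. As a minimal sketch of what a client would need, the Python below builds a TLS context that verifies the server and can also present a client certificate. The function name and all three file paths are hypothetical placeholders, not paths taken from the chart or from JupyterHub.

```python
import ssl
from typing import Optional


def make_mtls_client_context(
    cafile: Optional[str] = None,
    certfile: Optional[str] = None,
    keyfile: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context that can present a client certificate.

    All three paths are placeholders (assumptions) for wherever the
    internal CA bundle and a client certificate/key pair actually live.
    """
    # Verify the server against the given CA bundle
    # (falls back to the system trust store when cafile is None).
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=cafile)
    if certfile is not None:
        # Present this certificate when the server requests one.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx
```

Such a context could be passed to urllib.request.urlopen(url, context=ctx). With curl, the equivalent idea would be supplying --cert, --key and --cacert rather than relying on -k alone.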

manics commented 4 years ago

I'm afraid I haven't used internal_ssl before. This was the original PR that added it: https://github.com/jupyterhub/jupyterhub/pull/2055
@minrk @consideRatio Do you have any advice?

consideRatio commented 4 years ago

I know very little about the internal SSL setup. But in a k8s environment, if encryption between all pods is important, I would suggest using Istio's mutual TLS capabilities.

sstarcher commented 4 years ago

Any reason you would specifically recommend going that route? As I understand it, JupyterHub supports running each component separately and supports TLS between them all. I would have imagined that using something like Istio's mutual TLS would complicate the setup.

sstarcher commented 4 years ago

My understanding from reading through the code is that the only way to turn on SSL for the Hub portion is to use the internal_ssl config.

consideRatio commented 4 years ago

Istio solves it properly for all pod-to-pod communication, but it can perhaps be solved without that for the z2jh pods.

I really lack the domain knowledge here, though. I would be happy to understand more about this; I have more questions than answers about the internal_ssl configuration, which I haven't yet read up on.

sstarcher commented 4 years ago

I'm entirely new to Jupyter, so I have been working through the SSL config trying to see how to set it up, as the documentation does not seem to have sufficient information in it.

minrk commented 4 years ago

The main thing needed for internal_ssl to work in z2jh is to distribute the credentials to the pods appropriately. The Hub needs to store them somewhere, and they need to get to the other pods. Ideally, the Hub would store them directly in Kubernetes TLS secrets, but I don't think this is possible with the APIs we currently have; they must be on disk for the hub container at least. KubeSpawner would then be responsible for configuring the single-user pods to load the credentials (such as by putting them in TLS secrets). This will mean defining the KubeSpawner.move_certs method. Perhaps the most similar implementation is the one in dockerspawner, which creates a volume to put them in.

The last bit that will probably need updates in the chart here is figuring out the best way to bootstrap the initial certs that the Hub creates that need to be loaded into the proxy. In a local install, this is done with jupyterhub --generate-certs, which might be able to be an init container or something.
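
To make the idea of putting the credentials in TLS secrets a bit more concrete, here is a rough sketch that turns certificate files on disk into a Kubernetes Secret manifest. It is not KubeSpawner's implementation: the function name is hypothetical, and it assumes a paths dict with keyfile, certfile and cafile keys, similar to what a move_certs implementation would receive. It only builds the manifest as a plain dict and does not talk to the Kubernetes API.

```python
import base64
from pathlib import Path


def build_tls_secret_manifest(name: str, paths: dict) -> dict:
    """Build a kubernetes.io/tls Secret manifest from cert files on disk.

    `paths` is assumed to hold 'keyfile', 'certfile' and 'cafile' entries
    pointing at PEM files readable by the calling process.
    """

    def b64(path: str) -> str:
        # Secret `data` values must be base64-encoded strings.
        return base64.b64encode(Path(path).read_bytes()).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/tls",
        "data": {
            "tls.crt": b64(paths["certfile"]),
            "tls.key": b64(paths["keyfile"]),
            # Extra key carrying the CA bundle, so pods can verify their peers.
            "ca.crt": b64(paths["cafile"]),
        },
    }
```

A single-user pod could then mount such a secret as a volume and point its SSL settings at the mounted files. How the secret gets created and cleaned up is left open here.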

sstarcher commented 4 years ago

@minrk Thank you for your assistance. Let me know what you think of the PR: https://github.com/jupyterhub/kubespawner/pull/386

sstarcher commented 4 years ago

So with my PR above, my next issue is that the DNS name/IP is validated against the certificate, and in the Kubernetes world the IP is dynamic. Any recommendations for handling the following? I would attempt to create a svc for the pod and have it use its IP as the name of the service. Or does anyone have a better idea?

hub-8d9cc7897-ws5tp:hub [E 2019-12-25 19:01:09.263 JupyterHub iostream:737] Uncaught exception, closing connection.
hub-8d9cc7897-ws5tp:hub     Traceback (most recent call last):
hub-8d9cc7897-ws5tp:hub       File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 702, in _handle_events
hub-8d9cc7897-ws5tp:hub         self._handle_read()
hub-8d9cc7897-ws5tp:hub       File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1472, in _handle_read
hub-8d9cc7897-ws5tp:hub         self._do_ssl_handshake()
hub-8d9cc7897-ws5tp:hub       File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1391, in _do_ssl_handshake
hub-8d9cc7897-ws5tp:hub         self.socket.do_handshake()
hub-8d9cc7897-ws5tp:hub       File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
hub-8d9cc7897-ws5tp:hub         self._sslobj.do_handshake()
hub-8d9cc7897-ws5tp:hub       File "/usr/lib/python3.6/ssl.py", line 694, in do_handshake
hub-8d9cc7897-ws5tp:hub         match_hostname(self.getpeercert(), self.server_hostname)
hub-8d9cc7897-ws5tp:hub       File "/usr/lib/python3.6/ssl.py", line 327, in match_hostname
hub-8d9cc7897-ws5tp:hub         % (hostname, ', '.join(map(repr, dnsnames))))
hub-8d9cc7897-ws5tp:hub     ssl.CertificateError: hostname '100.114.20.137' doesn't match either of 'localhost', '127.0.0.1', 'hub', 'proxy-api', 'proxy-public', 'jupyter.dev.syapse.com'
hub-8d9cc7897-ws5tp:hub
hub-8d9cc7897-ws5tp:hub ERROR:asyncio:Exception in callback None()
hub-8d9cc7897-ws5tp:hub handle: <Handle cancelled>
hub-8d9cc7897-ws5tp:hub Traceback (most recent call last):
hub-8d9cc7897-ws5tp:hub   File "/usr/lib/python3.6/asyncio/events.py", line 145, in _run
hub-8d9cc7897-ws5tp:hub     self._callback(*self._args)
hub-8d9cc7897-ws5tp:hub   File "/usr/local/lib/python3.6/dist-packages/tornado/platform/asyncio.py", line 138, in _handle_events
hub-8d9cc7897-ws5tp:hub     handler_func(fileobj, events)
hub-8d9cc7897-ws5tp:hub   File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 702, in _handle_events
hub-8d9cc7897-ws5tp:hub     self._handle_read()
hub-8d9cc7897-ws5tp:hub   File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1472, in _handle_read
hub-8d9cc7897-ws5tp:hub     self._do_ssl_handshake()
hub-8d9cc7897-ws5tp:hub   File "/usr/local/lib/python3.6/dist-packages/tornado/iostream.py", line 1391, in _do_ssl_handshake
hub-8d9cc7897-ws5tp:hub     self.socket.do_handshake()
hub-8d9cc7897-ws5tp:hub   File "/usr/lib/python3.6/ssl.py", line 1077, in do_handshake
hub-8d9cc7897-ws5tp:hub     self._sslobj.do_handshake()
hub-8d9cc7897-ws5tp:hub   File "/usr/lib/python3.6/ssl.py", line 694, in do_handshake
hub-8d9cc7897-ws5tp:hub     match_hostname(self.getpeercert(), self.server_hostname)
hub-8d9cc7897-ws5tp:hub   File "/usr/lib/python3.6/ssl.py", line 327, in match_hostname
hub-8d9cc7897-ws5tp:hub     % (hostname, ', '.join(map(repr, dnsnames))))
hub-8d9cc7897-ws5tp:hub ssl.CertificateError: hostname '100.114.20.137' doesn't match either of 'localhost', '127.0.0.1', 'hub', 'proxy-api', 'proxy-public', 'jupyter.dev.syapse.com'
hub-8d9cc7897-ws5tp:hub [E 2019-12-25 19:01:09.269 JupyterHub user:687] Unhandled error waiting for test's server to show up at https://100.114.20.137:8888/user/test/: hostname '100.114.20.137' doesn't match either of 'localhost', '127.0.0.1', 'hub', 'proxy-api', 'proxy-public', 'jupyter.dev.syapse.com'
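
The failure in the traceback above comes down to the pod IP not appearing among the names the certificate was issued for. A simplified illustration of that check follows. It is not Python's actual matching algorithm (it ignores wildcards, for one), and the function name is hypothetical; the name list is copied from the error message.

```python
import ipaddress


def san_matches(target: str, sans: list) -> bool:
    """Simplified exact-match check of a host against certificate names."""
    try:
        target_ip = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP address: compare as a case-insensitive DNS name.
        return target.lower() in {s.lower() for s in sans}
    for s in sans:
        try:
            if ipaddress.ip_address(s) == target_ip:
                return True
        except ValueError:
            # This entry is a DNS name, which never matches an IP target.
            continue
    return False


# Names listed in the CertificateError above.
SANS = ["localhost", "127.0.0.1", "hub", "proxy-api", "proxy-public",
        "jupyter.dev.syapse.com"]

print(san_matches("100.114.20.137", SANS))  # False: the pod IP is not listed
print(san_matches("hub", SANS))             # True: a stable service name is
```

This is why connecting through a name that is known when the certificate is issued, or issuing the certificate with the pod's address included, would avoid the mismatch.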

sstarcher commented 4 years ago

Thanks, everyone, for the assistance. My two PRs above resolve this issue for me.

chancez commented 4 years ago

I'm probably going to take over #1535 and begin updating it to fully handle provisioning certificates in a Helm hook Job that will create a secret that can be mounted by the hub and proxy pods before they start up. It will still depend on https://github.com/jupyterhub/kubespawner/pull/409, however.

robertpyke commented 3 years ago

Is anyone still actively working on this? It looks like a few of the dependencies were in progress at the time of the last update, but some have since been merged and others closed. It’s a little hard for me to follow the state of this.

I’d need to ramp up on a fair few things, but I may have some time to work on this, depending on where it’s gotten to, and what’s pending.

sstarcher commented 3 years ago

I am no longer working on this, so I can't speak to the current progress. At the time I stopped working on it, the PRs were all functional, and the company I was working for has been using them in production for over a year.

consideRatio commented 3 years ago

This Helm chart has not yet been updated to work with https://github.com/jupyterhub/kubespawner/pull/409, but that is the next step. I don't have an overview of the details of how to do this, but I think enabling the feature would mostly mean making sure the various certs are mounted on the pods where they're relevant.

sstarcher commented 3 years ago

Here is the PR I opened against this repo and later closed: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1535/files

tfmark commented 1 year ago

We are looking to replace our old z2jh (0.11.1) setup with 3.0 once it's released. We cherry-picked portions of the #1535 PR to ensure we have SSL in place. Can anyone confirm the state of SSL in master? I see that https://github.com/jupyterhub/kubespawner/pull/409 has been merged. What's left to close out this issue? (I guess my real question is: how much work will I have to do to get SSL working with 3.0 😄)