jupyterhub / zero-to-jupyterhub-k8s

Helm Chart & Documentation for deploying JupyterHub on Kubernetes
https://zero-to-jupyterhub.readthedocs.io

Proposal: New Proxy which manages Kubernetes Ingress Resources #1642

Closed stv0g closed 2 years ago

stv0g commented 4 years ago

Hi all,

I am considering implementing a new proxy subclass which would allow us to get rid of the CHP:

https://github.com/stv0g/nginx-ingress

The idea is simple: jupyterhub_ingress_proxy would create / delete Kubernetes Ingress resources when instructed.

There are some minor quirks involved:

The whole state of the proxy would reside in the Kubernetes APIServer

We can add the route data as a JSON encoded label

Ingress resources must match a Service resource. The Proxy API only provides a URL as the target

But this can be resolved:

This article here describes the technique from above: https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-mapping-external-services

Alternatively, we might extend the Proxy API to include the singleuser pod name or hide it somewhere in the data attribute which gets passed to Proxy.add_route()
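
The points above can be sketched in plain Python. This is only an illustration, not the code of any existing proxy class: the function name `build_route_objects`, the annotation key `example.org/route-data`, and the object names are hypothetical. It assumes a path-only routespec and a target URL that contains a pod IP, and it uses the `networking.k8s.io/v1` Ingress layout, which is newer than the API versions discussed in this thread. Because Kubernetes label values are limited to 63 simple characters, the sketch keeps the JSON route data in an annotation rather than a label.

```python
import json
from urllib.parse import urlparse


def build_route_objects(name, routespec, target, data):
    """Return (endpoints, service, ingress) manifests for one proxy route."""
    url = urlparse(target)
    port = url.port or (443 if url.scheme == "https" else 80)

    # Service without a selector: Kubernetes does not manage its endpoints,
    # so an Endpoints object with the same name can point it anywhere.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"ports": [{"port": port, "targetPort": port}]},
    }
    # Endpoints addresses must be IPs, hence the pod-IP assumption above.
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        "metadata": {"name": name},
        "subsets": [
            {"addresses": [{"ip": url.hostname}], "ports": [{"port": port}]}
        ],
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            # Hypothetical annotation key holding the JSON-encoded route data.
            "annotations": {"example.org/route-data": json.dumps(data)},
        },
        "spec": {
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": routespec,
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": name,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    }
                }
            ]
        },
    }
    return endpoints, service, ingress


endpoints, service, ingress = build_route_objects(
    "jupyter-alice", "/user/alice/", "http://10.0.0.5:8888", {"user": "alice"}
)
```

Deleting a route would then amount to deleting these three objects by name, which keeps the whole proxy state in the API server as described above.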

Are Ingress controllers fast enough to apply the new Ingress rules without a noticeable delay for the users?

Does anybody have experience with this?

manics commented 4 years ago

Sounds like an interesting idea!

FYI there's ongoing work to replace CHP with Traefik: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/pull/1162

There's also ongoing work to add internal_ssl:

stv0g commented 4 years ago

Hi @manics,

Yes, I am aware of the Traefik proxy. However, we do not use Traefik as our Kubernetes Ingress controller. As far as I can see, the Traefik CHP implementation does not rely on the "Kubernetes-native" Ingress resources.

So if I succeed with the new ingress-proxy, we could use it with all available Ingress controllers (including Traefik).

Thanks for pointing me to https://github.com/jupyterhub/kubespawner/pull/386. This would be very helpful indeed. I will add my thoughts / review for this PR later.

I am just a bit surprised that there has been no work in this direction, as it seems to be the most elegant solution for the proxy in Kubernetes.

stv0g commented 4 years ago

Actually, somebody has already implemented exactly what I imagined:

https://github.com/jupyterhub/kubespawner/blob/master/kubespawner/proxy.py

The initial implementation is already 3 years old and has been done by @yuvipanda

@yuvipanda could you maybe provide some insight into why this is not the default setup for z2jh?

remche commented 4 years ago

Hi,

I'm interested (and surprised) too. The current Traefik integration still needs a CHP deployment. The goal would be to get rid of CHP completely.

I'm new to z2jh but I think we would need to:

@stv0g I'm ready to help on a PR if needed.

BertR commented 4 years ago

I bumped the models (going to ExtensionsV1beta first to remain compatible with Kubernetes 1.16, 1.17) in this PR https://github.com/jupyterhub/kubespawner/pull/402

In values.yaml I have

hub:
    extraConfig:
        00-proxy: |
            from kubespawner.proxy import KubeIngressProxy
            c.JupyterHub.proxy_class = KubeIngressProxy
            c.Proxy.should_start = False

stv0g commented 4 years ago

@BertR what's your impression of the latency that users encounter during startup?

Is it noticeably higher than with CHP?

BertR commented 4 years ago

> @BertR what's your impression of the latency that users encounter during startup?
>
> Is it noticeably higher than with CHP?

I haven't load tested this yet, but I think it will depend heavily on which ingress controller you have in your cluster (Nginx-ingress vs. AWS ALB vs. contour/envoy vs. haproxy). If somebody is interested in setting up a test for this, let me know, I'm happy to help out!

remche commented 4 years ago

@BertR thanks for the work, I just tested with traefik. @stv0g I did not notice higher latency, but it's just my feeling on a small test.

FYI I built a k8s-hub image featuring the PR: remche/k8s-hub:ingress-extv1b1. I had to patch the hub Role to allow (get, watch, list, create, delete, patch) on endpoints and services.
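
As a rough illustration of that Role patch, the extra rule could look like the following, written here as a Python dict that mirrors the shape of a Kubernetes RBAC `rules` entry. The names `extra_rule` and `allows` are hypothetical, and how the rule is merged into the chart's hub Role is left open; only the resources and verbs come from the comment above.

```python
# Sketch of one RBAC rule granting the hub access to Services and Endpoints.
extra_rule = {
    "apiGroups": [""],  # "" is the core API group, home of both resources
    "resources": ["endpoints", "services"],
    "verbs": ["get", "watch", "list", "create", "delete", "patch"],
}


def allows(rule, resource, verb):
    """Return True if this single rule grants `verb` on `resource`."""
    return resource in rule["resources"] and verb in rule["verbs"]
```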

remche commented 4 years ago

If we enable ingress for JupyterHub, the ingress rules created by the kubespawner proxy no longer match. The routespec should include the host (host.tld), but I'm not sure how to do this.
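
For context, a JupyterHub routespec is either path-only (`/path/`) or host-based (`host.tld/path/`). A minimal sketch of how a host could be carried over into an Ingress rule follows; the helper names `split_routespec` and `ingress_rule` are hypothetical, and this is not how kubespawner currently builds its rules.

```python
def split_routespec(routespec):
    """Split a routespec into (host, path); host is None for path-only specs."""
    if routespec.startswith("/"):
        return None, routespec
    host, _, rest = routespec.partition("/")
    return host, "/" + rest


def ingress_rule(routespec):
    """Build the host/path part of one Ingress rule (backend omitted)."""
    host, path = split_routespec(routespec)
    rule = {"http": {"paths": [{"path": path}]}}
    if host is not None:
        # Only set "host" when the routespec carries one, so path-only
        # routes keep matching requests for any host.
        rule["host"] = host
    return rule
```

With this, `ingress_rule("host.tld/user/alice/")` carries `host.tld` in the rule, while `ingress_rule("/user/alice/")` leaves the host unset.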

remche commented 4 years ago

After further testing, I found a bug when using named servers: links to a user's named server do not have a trailing slash (i.e. user/username/notebookname) but the ingress path has one (i.e. user/username/notebookname/).

@BertR do you want me to open an issue upstream?
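
The mismatch can be illustrated with a small sketch. It assumes a controller that compares paths as plain string prefixes, which may not hold for every ingress controller, and both function names are hypothetical.

```python
def strict_prefix_match(rule_path, request_path):
    """Plain string-prefix comparison, as a strict controller might do."""
    return request_path.startswith(rule_path)


def slash_tolerant_match(rule_path, request_path):
    """Also accept the rule path requested without its trailing slash."""
    bare = rule_path.rstrip("/")
    return request_path == bare or request_path.startswith(bare + "/")


rule = "/user/username/notebookname/"
link = "/user/username/notebookname"

# The link without a trailing slash misses the strict rule...
missed = not strict_prefix_match(rule, link)
# ...but is accepted once the trailing slash is treated as optional.
accepted = slash_tolerant_match(rule, link)
```

Either generating links with a trailing slash or making the rule slash-tolerant would remove the mismatch; which side to change is a design decision for the upstream projects.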

BertR commented 4 years ago

@remche yes, I'll try and have a look at both issues.

betatim commented 4 years ago

An aside: Welcome to Zero2JupyterHub 👋! It seems several of the people in this thread know each other and possibly communicate with each other outside of this issue tracker. It would be great if you could take a moment to introduce yourself over in https://discourse.jupyter.org/ and add some context on how you know each other etc. Even if you don't know each other, introducing yourself would be great and make you less "anonymous random stranger on the internet" :)

stv0g commented 4 years ago

Hi @betatim,

thanks for pointing me to the Jupyter discourse. I didn't know about it yet :) You can find my (slightly lengthy) introduction here

remche commented 4 years ago

Hi @betatim, done too, just next to Steffen ;)

remche commented 4 years ago

Bug reports opened ;) jupyterhub/kubespawner#404 jupyterhub/kubespawner#405 lmk if I can help.

remche commented 4 years ago

I worked on this lately and would be happy to hear your feedback 😉

consideRatio commented 2 years ago

I think this feature is what KubeIngressProxy is about. It is currently a class bundled inside the KubeSpawner project, and it is documented in https://github.com/jupyterhub/kubespawner/pull/568 as something the JupyterHub team isn't actively working to maintain at this point in time.

I'll close this issue at this point. If someone creates a discourse.jupyter.org post about working towards this, please link it to this issue for future reference as well!