jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/
Other
615 stars 221 forks source link

Handle the case when pod and/or pvc has already been created #1202

Closed luong-komorebi closed 1 year ago

luong-komorebi commented 1 year ago

I am not entirely sure this is the best way to go, as well as I am aware we should keep the original script minimal to encourage downstream customizations.

However, I find it confusing when out of the box, with the default kernel-pod.yaml.j2 (and kernel_volume enabled), we cannot have idempotency and jupyter EG will just return error because given the same kernel id, the same pod and pvc cannot be connected from the client side.

This pull request shows my little adjustments to improve that. And while of course it can be applied to other resources, I would propose just keep it as minimal as possible and PVC as well as pod proves to be great targets (they can also be cleaned up by terminate_container_resources inside k8s.py process proxy)

kevin-bates commented 1 year ago

I find it confusing when out of the box, with the default kernel-pod.yaml.j2 (and kernel_volume enabled), we cannot have idempotency and jupyter EG will just return error because given the same kernel id, the same pod and pvc cannot be connected from the client side.

I totally agree with this approach and am surprised this hasn't been brought up before. (I don't use volumes in my day-to-day usage, but probably should!). FWIW, we do trap 409 when attempting to create the kernel pod's namespace, but tolerate its existence when the kernel is restarting.

Thank you for another great contribution!

kevin-bates commented 1 year ago

Another point for future improvement if I have the time is to write some simple unit test for this script

Yes, that's another thing we're lacking.

Btw, I wanted to bring this repo to your attention. Kernel Provisioners essentially replace Process Proxies and are baked into the general Jupyter framework. They are nearly identical to process proxies (process proxies were essentially a proof of concept of kernel provisioners) as you can notice. In addition, the launch scripts are identical (except any environment variables are prefixed differently). EG 4.0 will have its process proxies and most all of its etc hierarchy removed and will, instead, leverage Remote Provisioners.

I bring this up because most items added to this repository, particularly anything in the services/processproxies or etc hierarchies also apply to https://github.com/gateway-experiments/remote_provisioners. So, if you're interested in applying similar changes/contributions there (until EG 4.0 exists), it would be greatly appreciated. If not, a maintainer will migrate applicable changes and add you (or whoever made the contribution) as a co-author. FYI.