jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/

Is there any way to limit all kernel resources now? #1075

Closed LeeMoonCh closed 2 years ago

LeeMoonCh commented 2 years ago

Description

If I use JEG to open a python_distributed kernel, is there any way to limit the resources used by the kernel? For example:

I want the kernel to use at most 5 GB of memory and 2 cores.

770 It is restricted at k8s.

kevin-bates commented 2 years ago

Hi @LeeMoonCh,

is there any way to limit the resources used by the kernel?

No, not at this time. This would likely be addressed when the ecosystem introduces parameterized kernels, but it's been tough to get traction on that.

As you're aware, the Kubernetes kernel environment can support resource limits like this because each kernel is associated with a kernel-pod template file. Because KERNEL_-prefixed envs flow from the client through EG to the kernel, they can behave like parameters.
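
To illustrate how KERNEL_-prefixed envs can act as parameters, here is a minimal sketch of a client building a kernel-start request body. It assumes the request carries a `name` and an `env` mapping. The variable names `KERNEL_MEMORY_LIMIT` and `KERNEL_CPU_LIMIT` and the helper `kernel_start_body` are hypothetical and chosen only for illustration.

```python
import json
import os


def kernel_start_body(kernel_name, environ=None):
    """Build a kernel-start request body carrying only KERNEL_-prefixed envs."""
    environ = os.environ if environ is None else environ
    # Only variables whose names start with KERNEL_ are forwarded.
    kernel_env = {k: v for k, v in environ.items() if k.startswith("KERNEL_")}
    return json.dumps({"name": kernel_name, "env": kernel_env}, sort_keys=True)


# Hypothetical variable names, used here only as an example.
body = kernel_start_body(
    "python_distributed",
    {"KERNEL_MEMORY_LIMIT": "5G", "KERNEL_CPU_LIMIT": "2", "PATH": "/usr/bin"},
)
```

Whatever launches the kernel on the other side would then have to read those variables and act on them.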

Given that KERNEL_-prefixed envs flow, and if you know how to start a process specifying these kinds of resources and those parameters can be specified on the command line, then you might be able to launch the kernel in that manner. Since the default kernel launchers don't know that kind of thing, I would imagine this would require some kind of process wrapper or shell, but we don't provide such functionality outside of Kubernetes. Sorry about that.
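
A minimal sketch of what such a process wrapper might look like on Linux, using only the Python standard library. This is not something EG provides. The variable names `KERNEL_MEMORY_LIMIT` and `KERNEL_CPU_LIMIT` are hypothetical, `RLIMIT_AS` caps virtual address space (which only approximates a memory limit), and CPU affinity pins the process to a number of cores rather than enforcing a quota.

```python
import os
import resource
import subprocess
import sys

_UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_memory(text):
    """Convert a string such as '5G', '5GB' or '512M' to a byte count."""
    text = text.strip().upper().rstrip("B")
    if text and text[-1] in _UNITS:
        return int(float(text[:-1]) * _UNITS[text[-1]])
    return int(text)


def limits_from_env(env):
    """Read the hypothetical KERNEL_MEMORY_LIMIT / KERNEL_CPU_LIMIT variables."""
    mem = env.get("KERNEL_MEMORY_LIMIT")
    cpus = env.get("KERNEL_CPU_LIMIT")
    return (parse_memory(mem) if mem else None, int(cpus) if cpus else None)


def launch(cmd, env=None):
    """Start cmd with the requested limits applied in the child (Linux only)."""
    env = dict(os.environ if env is None else env)
    mem_bytes, cpus = limits_from_env(env)

    def apply_limits():
        if mem_bytes is not None:
            # Cap the child's virtual address space.
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        if cpus is not None:
            # Pin the child to the first `cpus` CPUs it is allowed to use.
            allowed = sorted(os.sched_getaffinity(0))[:cpus]
            os.sched_setaffinity(0, allowed)

    return subprocess.Popen(cmd, env=env, preexec_fn=apply_limits)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the original kernel launch command.
    sys.exit(launch(sys.argv[1:]).wait())
```

The idea would be to prefix the kernelspec's launch command with this script, so the limits travel with the kernel-start request instead of being hard-coded.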

LeeMoonCh commented 2 years ago

Thank you! @kevin-bates
I think I could do this by modifying the startup script and passing the limits through as parameter variables.

kevin-bates commented 2 years ago

If you find a solution that fits nicely with the implementation please consider contributing it. It could provide a great datapoint relative to parameterized kernels.

LeeMoonCh commented 1 year ago

I think we can introduce cgroups to limit resource usage. This requires the model to include a pid. This idea is better implemented in jupyter server. I'll give it a try locally now.
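
A rough sketch of the cgroup idea, assuming cgroup v2 mounted at `/sys/fs/cgroup`, sufficient privileges, and the memory and cpu controllers enabled for the parent group. The function name and the group name are hypothetical. `memory.max`, `cpu.max` and `cgroup.procs` are the standard cgroup v2 interface files.

```python
from pathlib import Path


def limit_pid_with_cgroup(pid, name, memory_bytes, cores,
                          root="/sys/fs/cgroup", period_us=100_000):
    """Place pid in a cgroup v2 group capped at memory_bytes and `cores` CPUs."""
    group = Path(root) / name
    group.mkdir(parents=True, exist_ok=True)
    # memory.max: hard memory limit in bytes.
    (group / "memory.max").write_text(str(memory_bytes))
    # cpu.max: "<quota> <period>" in microseconds; quota = cores * period.
    (group / "cpu.max").write_text(f"{int(cores * period_us)} {period_us}")
    # cgroup.procs: writing a pid moves that process into the group.
    (group / "cgroup.procs").write_text(str(pid))
    return group
```

For the 5 GB and 2 core example, the call would be something like `limit_pid_with_cgroup(kernel_pid, "kernel-demo", 5 * 1024**3, 2)`, where `kernel_pid` is the pid the model would need to expose.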