jupyter-server / enterprise_gateway

A lightweight, multi-tenant, scalable and secure gateway that enables Jupyter Notebooks to share resources across distributed clusters such as Apache Spark, Kubernetes and others.
https://jupyter-enterprise-gateway.readthedocs.io/en/latest/

How do I customize the distributed R kernel? #1176

Closed: LeeMoonCh closed this issue 1 year ago

LeeMoonCh commented 1 year ago

I installed IRkernel with the following command: conda install -c r r-irkernel. That gave me an R kernel in the /usr/local/share/jupyter/kernels/ir directory. How do I get this kernel to work in JEG?

kevin-bates commented 1 year ago

Hi @LeeMoonCh - thanks for your question.

> So how do I get this kernel to work in JEG?

What kind of resource manager or configuration are you using EG to target? Your answer determines the kind of process proxy that will be used to launch and communicate with your kernel. EG provides example kernel specifications for the YARN, Docker, Kubernetes, and Distributed process proxies, configured for Python (ipykernel), R (IRkernel), and Scala (Apache Toree) kernels. These are available in the various tar files bound to each release and can also be built using make kernelspecs.

Please refer to our Operators Guide for deployment details relative to the configuration you're targeting.

LeeMoonCh commented 1 year ago

I want to use distributed.DistributedProcessProxy to run the IR kernel, but I couldn't find a matching kernelspec in the kernelspecs tar package. I want a standalone IR kernel, not the spark_R_yarn_client kernel. Do I need to write launch_IRkernel.R and run.sh myself? I also need to run a MATLAB kernel through a process proxy. 🙂😊

kevin-bates commented 1 year ago

Thank you for the additional information. Although the spark_R_yarn_client kernelspec we provide uses the DistributedProcessProxy, it also assumes the existence of Spark and YARN, which you don't want. We only provide a python_distributed example, but your kernelspec (i.e., kernel.json) for R would be identical to it except for how the R kernel launcher is invoked. Here's an example, based on python_distributed, for a kernel.json file located in the r_distributed directory:

{
  "display_name": "R (distributed)",
  "language": "R",
  "metadata": {
    "process_proxy": {
      "class_name": "enterprise_gateway.services.processproxies.distributed.DistributedProcessProxy"
    }
  },
  "argv": [
    "Rscript",
    "/usr/local/share/jupyter/kernels/r_distributed/scripts/launch_IRkernel.R",
    "--RemoteProcessProxy.kernel-id",
    "{kernel_id}",
    "--RemoteProcessProxy.response-address",
    "{response_address}",
    "--RemoteProcessProxy.public-key",
    "{public_key}",
    "--RemoteProcessProxy.port-range",
    "{port_range}",
    "--RemoteProcessProxy.spark-context-initialization-mode",
    "none"
  ]
}
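As a quick sanity check before deploying a kernelspec like the one above, here is a minimal Python sketch that inspects a kernel.json dict. It is not part of EG; the function names and the list of "expected" placeholders are assumptions taken only from the example above, so adjust them to your own kernelspec.

```python
import json

# Proxy class name copied from the kernel.json example above.
DISTRIBUTED_PROXY = (
    "enterprise_gateway.services.processproxies.distributed.DistributedProcessProxy"
)
# Assumption: the placeholders worth checking are the four used in the example.
EXPECTED_PLACEHOLDERS = (
    "{kernel_id}",
    "{response_address}",
    "{public_key}",
    "{port_range}",
)

# The example kernelspec from this thread, as a Python dict.
EXAMPLE_SPEC = {
    "display_name": "R (distributed)",
    "language": "R",
    "metadata": {"process_proxy": {"class_name": DISTRIBUTED_PROXY}},
    "argv": [
        "Rscript",
        "/usr/local/share/jupyter/kernels/r_distributed/scripts/launch_IRkernel.R",
        "--RemoteProcessProxy.kernel-id", "{kernel_id}",
        "--RemoteProcessProxy.response-address", "{response_address}",
        "--RemoteProcessProxy.public-key", "{public_key}",
        "--RemoteProcessProxy.port-range", "{port_range}",
        "--RemoteProcessProxy.spark-context-initialization-mode", "none",
    ],
}


def check_kernelspec(spec):
    """Return a list of problems found in a kernel.json dict (empty if none)."""
    problems = []
    class_name = spec.get("metadata", {}).get("process_proxy", {}).get("class_name")
    if class_name != DISTRIBUTED_PROXY:
        problems.append(f"unexpected process proxy: {class_name!r}")
    argv = spec.get("argv", [])
    for placeholder in EXPECTED_PLACEHOLDERS:
        if placeholder not in argv:
            problems.append(f"missing argv placeholder: {placeholder}")
    return problems


def check_kernelspec_file(path):
    """Load a kernel.json file from disk and run the same checks on it."""
    with open(path, encoding="utf-8") as handle:
        return check_kernelspec(json.load(handle))


print(check_kernelspec(EXAMPLE_SPEC) or "kernel.json looks consistent")
```

To check a real file, call check_kernelspec_file with the path to your kernel.json. The check only catches structural slips (wrong proxy class, dropped placeholder); it says nothing about whether the launcher script itself works.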

Note that when using the DistributedProcessProxy, all configured hosts must have identical filesystem layouts with respect to kernelspecs. So the file /usr/local/share/jupyter/kernels/r_distributed/scripts/launch_IRkernel.R needs to exist on all hosts at that same path, and Rscript must be available in the PATH on each host for the ssh user.
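One way to verify that requirement is to run the same two checks on every host over ssh. The following is a rough sketch, not an EG feature: the helper names and the host names node1 and node2 are hypothetical, and it assumes key-based ssh access and a POSIX shell on each host. A plain ssh command may also see a different PATH than the one EG's own ssh session gets, so treat a pass as a hint rather than proof.

```python
import shlex
import subprocess

# Launcher path taken from the kernel.json example in this thread.
LAUNCHER = "/usr/local/share/jupyter/kernels/r_distributed/scripts/launch_IRkernel.R"


def build_host_check(host, launcher_path=LAUNCHER):
    """Build an ssh command that succeeds only if Rscript is on PATH
    and the launcher file exists at the given path on the remote host."""
    remote = f"command -v Rscript >/dev/null && test -f {shlex.quote(launcher_path)}"
    return ["ssh", host, remote]


def check_hosts(hosts, launcher_path=LAUNCHER):
    """Run the check on each host and map host name -> True/False."""
    results = {}
    for host in hosts:
        completed = subprocess.run(build_host_check(host, launcher_path))
        results[host] = completed.returncode == 0
    return results


# Show the command that would be run; "node1" is a hypothetical host name.
print(build_host_check("node1"))
```

Calling check_hosts(["node1", "node2"]) with your actual host list returns a dict showing which hosts passed.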

Regarding the matlab kernel, we recently introduced the ability to support kernels that are subclasses of ipykernel in #1076, whose description happens to use an example for Matlab. I recommend you take a look there as well as this topic in our docs.

LeeMoonCh commented 1 year ago

@kevin-bates Thank you!

> Regarding the matlab kernel, we recently introduced the ability to support kernels that are subclasses of ipykernel in https://github.com/jupyter-server/enterprise_gateway/pull/1076, whose description happens to use an example for Matlab.

That seems to be intended for JEG 3.0. Can I use launch_ipykernel.py directly on JEG 2.6? 😄

kevin-bates commented 1 year ago

While not technically supported, you should be able to use the 3.x launchers in 2.6. However, this particular parameter (--kernel-class-name) should not carry the "RemoteProcessProxy." prefix in the argv list of the kernel.json file.
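To illustrate the prefix rule, here is a small Python sketch that rewrites an argv list so that only --kernel-class-name loses its prefix while the other parameters keep theirs. The helper name, the launcher path, and the class name my_pkg.MyKernel are all hypothetical placeholders, not values from EG or from #1076.

```python
PREFIX = "--RemoteProcessProxy."
# Per the comment above, only this parameter must appear without the prefix.
UNPREFIXED = {"kernel-class-name"}


def normalize_argv(argv):
    """Return a copy of argv where parameters listed in UNPREFIXED
    have the 'RemoteProcessProxy.' prefix stripped."""
    fixed = []
    for arg in argv:
        name = arg[len(PREFIX):] if arg.startswith(PREFIX) else None
        if name in UNPREFIXED:
            fixed.append("--" + name)
        else:
            fixed.append(arg)
    return fixed


# Hypothetical argv fragment; the path and class name are placeholders.
example = [
    "python",
    "/path/to/launch_ipykernel.py",
    "--RemoteProcessProxy.kernel-id", "{kernel_id}",
    "--RemoteProcessProxy.kernel-class-name", "my_pkg.MyKernel",
]
print(normalize_argv(example))
```

In practice you would simply edit kernel.json by hand; the sketch is only meant to show which entry changes and which stay as they are.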

LeeMoonCh commented 1 year ago

@kevin-bates Thank you! I will upgrade JEG to 3.x.