canonical / charmed-spark-rock

This repository contains the packaging metadata for creating a ROCK for Apache Spark

ROCK will need to have configurable notebook-dir to avoid data loss in Kubeflow #67

Open kimwnasptd opened 8 months ago

kimwnasptd commented 8 months ago

Creating an issue after our findings with @deusebio, for integrating the PySpark ROCK with Kubeflow.

In Kubeflow, when a new Notebook Pod gets created, the default behaviour is to mount a PVC at the hardcoded path /home/jovyan. Anything jupyterlab saves outside that mount lives on the container's ephemeral filesystem and is lost when the Pod is recreated. We'll need a mechanism for the ROCK to configure the --notebook-dir arg of jupyterlab, so that we can set it accordingly for Kubeflow.

Example of how this is set in upstream: https://github.com/kubeflow/kubeflow/blob/master/components/example-notebook-servers/jupyter/s6/services.d/jupyterlab/run#L9

Making the code read an ENV variable should be good enough. From the Kubeflow side we can then create a PodDefault that sets that variable on the Notebook Pods.
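
To make the proposal concrete, here is a rough sketch of what reading the variable could look like. It is only an illustration: the variable name `NOTEBOOK_DIR` and the helper `build_jupyterlab_command` are hypothetical, and the ROCK's actual entrypoint may be structured quite differently. When the variable is unset, the flag is omitted so the current behaviour stays unchanged.

```python
from typing import List, Mapping

# Hypothetical variable name; the issue does not settle on one.
NOTEBOOK_DIR_ENV = "NOTEBOOK_DIR"


def build_jupyterlab_command(env: Mapping[str, str]) -> List[str]:
    """Build the jupyterlab argv, adding --notebook-dir only when configured."""
    cmd = ["jupyter", "lab"]
    # Treat an empty or whitespace-only value the same as "not set".
    notebook_dir = env.get(NOTEBOOK_DIR_ENV, "").strip()
    if notebook_dir:
        cmd.append(f"--notebook-dir={notebook_dir}")
    return cmd
```

An entrypoint could call `build_jupyterlab_command(os.environ)` and hand the result to `os.execvp`. A Kubeflow PodDefault would then only need to set `NOTEBOOK_DIR=/home/jovyan` on the Notebook Pods for notebooks to land on the mounted PVC.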