Enabling Spark Development on AAW Notebooks

A continuation of https://github.com/StatCan/aaw/issues/1867, dedicated to enabling Spark development in JupyterLab notebooks.
This Dockerfile from JupyterLab contains the installation and setup for Spark, but is missing the installation of Scala, which doesn't seem to be required. The get-spark-stuff.sh script in aaw-kubeflow-containers seems to pull this Dockerfile, and it is called in the Makefile under generate-Spark. generate-Spark is not called anywhere when building the actual images, since we call generate-dockerfiles, which indicates that this target isn't actually doing anything. We'll want to add a call to generate-Spark to produce the Spark layer.
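One way to wire that in, assuming both targets keep their current names, is to add generate-Spark as a prerequisite of generate-dockerfiles. A sketch of the single line that would change in the Makefile (the target names come from this issue; nothing else about the existing rules is assumed):

```make
# Sketch only: add generate-Spark as a prerequisite of the existing
# generate-dockerfiles target so the Spark layer is produced whenever the
# Dockerfiles are generated.
generate-dockerfiles: generate-Spark
```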
Scala can be installed with pip3 install scala, or possibly with conda/mamba, but not with the default channels (I believe bioconda has it).
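For reference, the two candidate install commands as they would run inside the image; neither is verified here, and the bioconda channel is only the guess from above:

```sh
# Unverified install routes for Scala mentioned in this issue.
pip3 install scala
# or, via mamba from a non-default channel (bioconda is a guess):
mamba install --yes -c bioconda scala
```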
Adding Spark to the Dockerfiles

https://github.com/StatCan/aaw-kubeflow-containers/pull/551/files
The get-spark-stuff shell script simply concatenates the Dockerfiles, which I'm not really a fan of. The layer also doesn't work out of the box because some upstream files are missing from our Docker context.
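One way to make the layer buildable would be to fetch those upstream support files into our build context before running docker build. A minimal sketch; the URL and file name below are illustrative assumptions, not the actual files the layer is missing:

```sh
#!/usr/bin/env bash
# Fetch upstream helper files that the concatenated Spark Dockerfile references
# but that are absent from our build context. The URL and file list are
# assumptions for illustration only.
set -euo pipefail

UPSTREAM_RAW=https://raw.githubusercontent.com/jupyter/docker-stacks/main/images/pyspark-notebook

for f in setup_spark.py; do
  curl -fsSL "${UPSTREAM_RAW}/${f}" -o "${f}"
done
```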