This is a container image intended to make it easy to run Jupyter notebooks with Apache Spark on OpenShift. You can use it as-is (by adding it to a project), or you can use it as the basis for another image. In the latter case, you'll probably want to add some notebooks, data, and/or additional packages to the derived image.
For your convenience, binary image builds are available from Docker Hub.
radanalyticsio/base-notebook
to an OpenShift project.JUPYTER_NOTEBOOK_PASSWORD
in the pod environment to something you can remember (this step is optional but highly recommended; if you don't do this, you'll need to trawl the logs for an access token for your new notebook).nbuser
(uid 1011), add notebooks to /notebooks
and data to /data
.Make sure that this notebook image is running the same version of Spark as the external cluster you want to connect it to.
This image was initially based on Graham Dumpleton's images, which have some additional functionality (notably s2i support) that we'd like to incorporate in the future.