mozilla / jupyter-spark

Jupyter Notebook extension for Apache Spark integration
Mozilla Public License 2.0

Doesn't work with multiple notebook kernels (or multiple spark contexts) #36

Open mdboom opened 6 years ago

mdboom commented 6 years ago

As reported by @dmvieira in this comment, the plugin doesn't work correctly when multiple notebook kernels are running (or when a single notebook has multiple Spark contexts), because the plugin is essentially designed with one running Spark context/instance in mind.

That said, this is tricky to solve in the general case. It would require examining the pyspark objects in the cell to see which Spark instances they are connected to.
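To give a rough idea of what "examining the pyspark objects" could look like, here is a minimal sketch. It is not how the plugin works today. `find_spark_contexts` is a hypothetical helper, and it only recognizes `SparkSession`-like objects (via their `sparkContext` attribute) and `SparkContext`-like objects (via `applicationId` and `uiWebUrl`). DataFrames and RDDs are ignored, which is part of why the general case is hard.

```python
def find_spark_contexts(namespace):
    """Map variable names to the Spark application they appear to belong to.

    Hypothetical helper: relies on duck typing rather than importing pyspark,
    so it only sees SparkSession-like and SparkContext-like objects.
    """
    found = {}
    # Copy the items so the namespace can change while we iterate.
    for name, obj in list(namespace.items()):
        # A SparkSession exposes its context as .sparkContext;
        # otherwise treat the object itself as a candidate context.
        sc = getattr(obj, "sparkContext", obj)
        app_id = getattr(sc, "applicationId", None)
        ui_url = getattr(sc, "uiWebUrl", None)
        if isinstance(app_id, str):
            found[name] = (app_id, ui_url)
    return found
```

In a notebook this could be called as `find_spark_contexts(globals())`. Two names that map to different application IDs would indicate more than one Spark context in the same kernel.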

Alternatively, the best advice is probably to start the Spark instance manually outside of Jupyter/PySpark, and then write the pyspark calls to use that instance rather than creating their own instances. (This is essentially the same advice as for running against a remote Spark.) We can probably solve that with a little documentation.
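For the documentation, something along these lines might work as an example. It is only a sketch, assuming a standalone Spark master was started separately and is listening on the default port 7077. `spark_master_url` and `get_shared_session` are hypothetical names, and reading the host from `SPARK_MASTER_HOST` is an assumption made for this example, not something the plugin requires.

```python
import os


def spark_master_url(host=None, port=7077):
    """Build the URL of an already-running standalone Spark master.

    Assumption: the host falls back to SPARK_MASTER_HOST, then localhost.
    """
    host = host or os.environ.get("SPARK_MASTER_HOST", "localhost")
    return f"spark://{host}:{port}"


def get_shared_session(app_name="notebook"):
    """Attach to the external Spark instance instead of starting a local one."""
    # Imported lazily so the URL helper works without pyspark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(spark_master_url())
        .appName(app_name)
        .getOrCreate()
    )
```

Each notebook would then call `get_shared_session()` rather than building its own local context, so every kernel talks to the same externally managed instance.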

mdboom commented 6 years ago

Oops... This is a duplicate of #22.