jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Add support kind=shared to support Livy 0.5 #450

Open ying1 opened 6 years ago

ying1 commented 6 years ago

Livy 0.5 has a new feature that enables multiple interpreter support in one session: https://github.com/apache/incubator-livy/commit/c1aafeb6cb87f2bd7f4cb7cf538822b59fb34a9c

Documentation: https://livy.incubator.apache.org/docs/latest/rest-api.html#session-kind

This seems to break sparkmagic: if the Livy server has a shared session running, "Add Endpoint" fails with:

BadUserDataException                      Traceback (most recent call last)
/opt/conda/lib/python2.7/site-packages/hdijupyterutils/ipywidgetfactory.pyc in submit_clicked(self, button)
     63 
     64     def submit_clicked(self, button):
---> 65         self.parent_widget.run()

/opt/conda/lib/python2.7/site-packages/sparkmagic/controllerwidget/addendpointwidget.pyc in run(self)
     63         # We need to call the refresh method because drop down in Tab 2 for endpoints wouldn't refresh with the new
     64         # value otherwise.
---> 65         self.refresh_method()
     66 
     67     def _show_correct_endpoint_fields(self):

...<other stacks>

/opt/conda/lib/python2.7/site-packages/sparkmagic/livyclientlib/sparkcontroller.pyc in _livy_session(http_client, properties, ipython_display, session_id)
    109                       session_id=-1):
    110         return LivySession(http_client, properties, ipython_display,
--> 111                            session_id, heartbeat_timeout=conf.livy_server_heartbeat_timeout_seconds())
    112 
    113     @staticmethod

/opt/conda/lib/python2.7/site-packages/sparkmagic/livyclientlib/livysession.pyc in __init__(self, http_client, properties, ipython_display, session_id, spark_events, heartbeat_timeout, heartbeat_thread)
     88         if kind not in constants.SESSION_KINDS_SUPPORTED:
     89             raise BadUserDataException(u"Session of kind '{}' not supported. Session must be of kinds {}."
---> 90                                        .format(kind, ", ".join(constants.SESSION_KINDS_SUPPORTED)))
     91 
     92         self._app_id = None

BadUserDataException: Session of kind 'shared' not supported. Session must be of kinds spark, pyspark, pyspark3, sparkr.

This traces back to the supported-kinds list (constants.SESSION_KINDS_SUPPORTED) defined in: https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/sparkmagic/utils/constants.py
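For illustration, here is a minimal standalone sketch of the check that raises in the traceback above. The list contents are copied from the error message; `check_kind` and the local `BadUserDataException` class are hypothetical stand-ins, not sparkmagic's actual code. It only shows that a 'shared' session is rejected until the kind is added to the supported list; it does not claim that adding the kind is the whole fix.

```python
# Kinds taken from the error message above; the real list lives in
# sparkmagic's constants.py and may differ between versions.
SESSION_KINDS_SUPPORTED = ["spark", "pyspark", "pyspark3", "sparkr"]


class BadUserDataException(Exception):
    """Hypothetical stand-in for sparkmagic's exception of the same name."""


def check_kind(kind, supported):
    """Mirror the membership check shown in the livysession.py frame."""
    if kind not in supported:
        raise BadUserDataException(
            u"Session of kind '{}' not supported. Session must be of kinds {}."
            .format(kind, ", ".join(supported)))
    return kind


# A session created by Livy 0.5 with kind=shared is rejected today...
try:
    check_kind("shared", SESSION_KINDS_SUPPORTED)
    rejected = False
except BadUserDataException:
    rejected = True

# ...and passes the check once 'shared' is part of the list.
accepted = check_kind("shared", SESSION_KINDS_SUPPORTED + ["shared"])
```

Passing this check would only get "Add Endpoint" past the exception; any code that picks an interpreter based on the session kind would still need to handle 'shared'.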

thesuperzapper commented 6 years ago

@aggFTW (Not sure who's maintaining this repo)

I think the way to do this would be similar to Zeppelin.

E.g. %%spark.python for a Python statement and %%spark.scala for a Scala statement, with a user-configurable default language that is invoked as plain %%spark.

This would also require us to change the %%sql implementation, but it would save us from having to spin up a new session for SQL queries.
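A rough sketch of how such per-cell magics might map onto a shared session, assuming (per my reading of the Livy 0.5 REST docs) that the statement body sent to POST /sessions/{id}/statements accepts a `kind` of spark, pyspark, sparkr or sql. The magic names other than %%spark.python and %%spark.scala, the `MAGIC_TO_KIND` table and `build_statement` are all hypothetical; nothing here exists in sparkmagic.

```python
# Hypothetical mapping from a Zeppelin-style magic name to the Livy
# statement kind. Only spark.python and spark.scala come from the
# proposal above; spark.r and spark.sql are assumed for illustration.
MAGIC_TO_KIND = {
    "spark.scala": "spark",
    "spark.python": "pyspark",
    "spark.r": "sparkr",
    "spark.sql": "sql",
}


def build_statement(magic, code, default_kind="pyspark"):
    """Build the JSON body for one statement in a shared session.

    A bare 'spark' magic falls back to the user's default language;
    unknown magic names raise ValueError.
    """
    if magic == "spark":
        kind = default_kind
    elif magic in MAGIC_TO_KIND:
        kind = MAGIC_TO_KIND[magic]
    else:
        raise ValueError("Unknown magic: {}".format(magic))
    return {"code": code, "kind": kind}


# The same session could then serve Scala, Python and SQL cells.
scala_body = build_statement("spark.scala", "val x = 1")
default_body = build_statement("spark", "x = 1")
sql_body = build_statement("spark.sql", "SELECT 1")
```

With this shape, a %%sql cell would become just another statement kind in the existing session rather than a reason to start a separate one.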