krishnan-r / sparkmonitor

Monitor Apache Spark from Jupyter Notebook
https://krishnan-r.github.io/sparkmonitor/
Apache License 2.0

Integration with nteract #14

Open krishnan-r opened 5 years ago

krishnan-r commented 5 years ago

nteract is a frontend for Jupyter that runs natively on the desktop, on the web, and in other places such as inside the Atom editor.

The nteract desktop app runs on Electron and is implemented in TypeScript using React, Redux, and RxJS, along with other libraries.

The goal of this feature is to integrate SparkMonitor support directly into nteract, providing a seamless user experience for using Spark from nteract.

Work needs to be done on refactoring SparkMonitor to support nteract, improving Jupyter protocol support for this use case, and supporting Scala kernels.

This issue summarizes discussions on Slack with @rgbkrk and others in the nteract/spark_integration channel.

krishnan-r commented 5 years ago

Work in Progress

A preliminary demo implementation with React + Redux + TypeScript is on the way here: https://github.com/krishnan-r/sparkmonitor/tree/dev. It works with Jupyter Notebook at the moment, but supporting nteract would be straightforward once the necessary refactoring is done.

The idea is that data received from a SparkListener is stored in a Redux store on the frontend. Incoming comm messages dispatch actions that update the store. Other events, such as those related to notebook cells, also affect the state.
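As a rough illustration of this idea, here is a minimal sketch of a Redux-style reducer written as a plain function, so it runs without the Redux library. The action names, message fields, and state shape (`JOB_START`, `TASK_END`, `JOB_END`, `CELL_DELETED`, `MonitorState`) are hypothetical and chosen only for this example; the actual comm payloads in SparkMonitor may look different.

```typescript
// Hypothetical state and action shapes, for illustration only.
type JobStatus = "RUNNING" | "SUCCEEDED" | "FAILED";

interface JobState {
  jobId: number;
  cellId: string; // the notebook cell that spawned the job
  status: JobStatus;
  numTasks: number;
  completedTasks: number;
}

interface MonitorState {
  jobs: Record<number, JobState>;
}

// Actions that a comm message handler (or a notebook cell event) might dispatch.
type MonitorAction =
  | { type: "JOB_START"; jobId: number; cellId: string; numTasks: number }
  | { type: "TASK_END"; jobId: number }
  | { type: "JOB_END"; jobId: number; succeeded: boolean }
  | { type: "CELL_DELETED"; cellId: string };

const initialState: MonitorState = { jobs: {} };

// Pure reducer: returns a new state object and never mutates the old one.
function monitorReducer(
  state: MonitorState = initialState,
  action: MonitorAction
): MonitorState {
  switch (action.type) {
    case "JOB_START": {
      const job: JobState = {
        jobId: action.jobId,
        cellId: action.cellId,
        status: "RUNNING",
        numTasks: action.numTasks,
        completedTasks: 0,
      };
      return { jobs: { ...state.jobs, [action.jobId]: job } };
    }
    case "TASK_END": {
      const job = state.jobs[action.jobId];
      if (!job) return state; // ignore events for unknown jobs
      const completedTasks = Math.min(job.completedTasks + 1, job.numTasks);
      return {
        jobs: { ...state.jobs, [action.jobId]: { ...job, completedTasks } },
      };
    }
    case "JOB_END": {
      const job = state.jobs[action.jobId];
      if (!job) return state;
      const status: JobStatus = action.succeeded ? "SUCCEEDED" : "FAILED";
      return { jobs: { ...state.jobs, [action.jobId]: { ...job, status } } };
    }
    case "CELL_DELETED": {
      // A notebook cell event: drop every job that belonged to the cell.
      const remaining: Record<number, JobState> = {};
      for (const key in state.jobs) {
        const job = state.jobs[Number(key)];
        if (job.cellId !== action.cellId) remaining[job.jobId] = job;
      }
      return { jobs: remaining };
    }
    default:
      return state;
  }
}
```

With the real Redux library, a reducer of this shape could be passed to the store factory, and the comm message handler would call `dispatch` with one of these actions for each listener event.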

Below each cell, progress bars are displayed using a store-connected React component that takes a cell_id and tracks all jobs spawned from that cell. More complex visualizations can be implemented the same way.
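The per-cell tracking could be sketched as a selector: a pure function that takes the store state and a cell id and returns the aggregate progress of that cell's jobs. A connected component would call something like this when mapping state to props. The names here (`selectCellProgress`, `StoreState`, `JobSummary`) are assumptions made for this sketch and are not taken from the SparkMonitor code; React itself is left out so the example stays self-contained.

```typescript
// Hypothetical, simplified store shape for this sketch.
interface JobSummary {
  cellId: string;
  numTasks: number;
  completedTasks: number;
}

interface StoreState {
  jobs: JobSummary[];
}

interface CellProgress {
  jobCount: number;
  totalTasks: number;
  completedTasks: number;
  fraction: number; // between 0 and 1; 0 when the cell has no tasks yet
}

// Aggregate the progress of every job spawned from the given cell.
function selectCellProgress(state: StoreState, cellId: string): CellProgress {
  const jobs = state.jobs.filter((job) => job.cellId === cellId);
  const totalTasks = jobs.reduce((sum, job) => sum + job.numTasks, 0);
  const completedTasks = jobs.reduce((sum, job) => sum + job.completedTasks, 0);
  const fraction = totalTasks === 0 ? 0 : completedTasks / totalTasks;
  return { jobCount: jobs.length, totalTasks, completedTasks, fraction };
}

// Turn the progress into a label that a progress bar component could show.
function progressLabel(progress: CellProgress): string {
  const percent = Math.round(progress.fraction * 100);
  return (
    progress.completedTasks + "/" + progress.totalTasks +
    " tasks (" + percent + "%)"
  );
}
```

Keeping the aggregation in a selector means the component only re-renders from derived values, and richer visualizations can reuse the same store through selectors of their own.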

krishnan-r commented 5 years ago

SparkMonitor works using SparkListeners, kernel extensions, the widget comm API, and frontend extensions, as described here

Instead of the widget comm API, an alternative is to use the display and update_display functions in IPython. Combined with VDOM, this can render progress bars with complex HTML. However, running arbitrary JavaScript is still a challenge.

Another potential feature, not yet implemented, that could be used is: https://github.com/jupyter/jupyter/issues/264

Much research and work on improving Jupyter Protocol support for Spark is on the roadmap here

rgbkrk commented 5 years ago

👀 @willingc @captainsafia @alexarchambault @mpacer @mseal

Discussion from Slack brought over here.