FRosner / spawncamping-dds

Data-Driven Spark allows quick data exploration based on Apache Spark.

DDS integration into Zeppelin #190

Open RPCMoritz opened 9 years ago

RPCMoritz commented 9 years ago

Zeppelin is quickly becoming the most popular Notebook-like environment for Apache Spark.

A Google Summer of Code project is currently in progress to enable visualization library support for Zeppelin. A custom interpreter could be used as the basis for providing DDS functions on existing structures in Zeppelin's global SparkContext. We have to coordinate with Madhuka Udantha and his mentors to see how they plan to open up and modularise the visualization component, and how to pass DDS components into this interface.

Alternatively, the DDS jar would have to be integrated into the classpath for the spark interpreter, and/or dependency-loaded and used in a more ad-hoc manner.

FRosner commented 9 years ago

@RPCMoritz thanks for proposing this. Could you please provide a link to the corresponding Zeppelin JIRA issue? As soon as you have a better understanding of what it takes from our side to make this happen, please provide a small design document (maybe in this issue description).

I moved it to backlog for now as we don't know when it is going to happen. We will schedule it accordingly once we have some more detailed information.

RPCMoritz commented 9 years ago

I just found the following information, which may provide another way of interacting with Zeppelin:

"If table contents start with %html, it is interpreted as an HTML." (from https://zeppelin.incubator.apache.org/docs/display.html)

It may be possible to abuse or extend this so that a DDS interpreter redirects the web server output to standard out in a compatible format. One current limitation of this approach is that the output has to be wrapped in a table environment.

The JIRA for the pluggable visualization approach can be found here: https://issues.apache.org/jira/browse/ZEPPELIN-107

The corresponding progress is also being tracked here: https://cwiki.apache.org/confluence/display/ZEPPELIN/COMDEV-119+Zeppelin+GSoC+Project%3A+add+more+D3+visualization

FRosner commented 9 years ago

@RPCMoritz so what do you propose?

RPCMoritz commented 9 years ago

I did some further research into the issue. The PR linked in the JIRA (https://github.com/apache/incubator-zeppelin/pull/27) also gives access to an Angular interpreter, which we may be able to encapsulate or extend. I'll take a closer look at the individual capabilities ASAP.

FRosner commented 9 years ago

Thanks @RPCMoritz

RPCMoritz commented 9 years ago

I just figured out that the table wrapping isn't necessary for HTML rendering. Assuming the HTML rendering is fully featured, we can probably integrate DDS that way, provided we can intercept the generated HTML and write it out instead of serving it.
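A minimal sketch of the "write it out instead of serving it" idea, assuming Zeppelin renders any paragraph output that begins with `%html` as HTML. It is written in Python for brevity; the same prefix would apply to output printed from the Scala interpreter. The helper name `as_zeppelin_html` and the sample markup are hypothetical and not part of DDS or Zeppelin.

```python
def as_zeppelin_html(html: str) -> str:
    """Prefix an HTML fragment so Zeppelin's display system renders it."""
    # Assumption: paragraph output that starts with "%html" is shown as HTML.
    return "%html " + html.strip()


# Hypothetical stand-in for the markup DDS would otherwise serve over HTTP.
generated_html = "<div id='dds-chart'><h3>Histogram</h3></div>"

# Printing to standard out is what makes the notebook pick the markup up.
print(as_zeppelin_html(generated_html))
```

Whether this is enough depends on whether the rendered output can also load the JavaScript that DDS visualizations need, which is still open in this thread.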

FRosner commented 9 years ago

If there are any changes required from the DDS side, let's discuss them here and eventually create a ticket. We can put them as subtasks to this one.

RPCMoritz commented 9 years ago

Another thing to consider is the multi-user capability of Zeppelin (only one SparkContext per Zeppelin instance, but multiple browsers possible). This may require some modifications before the more naive integration options work transparently.

FRosner commented 9 years ago

Yes. It might mean that we need to change the way servables are kept in the server. Can notebooks be saved and reopened later? If so, we need a way to support that. Ideally, the resulting HTML would be saved in Zeppelin. This would not be possible if we used an iframe solution.
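One way to picture a changed servable storage is a store keyed per notebook paragraph rather than a single shared slot. This is only a sketch under the assumption that several browsers share one server; the class `ServableStore`, the key format, and the servable dictionaries are all hypothetical and do not reflect the actual DDS server code.

```python
from typing import Dict, Optional


class ServableStore:
    """Keeps one servable per key instead of a single shared 'latest' slot."""

    def __init__(self) -> None:
        self._servables: Dict[str, dict] = {}

    def put(self, key: str, servable: dict) -> None:
        # A later put with the same key replaces only that paragraph's servable.
        self._servables[key] = servable

    def get(self, key: str) -> Optional[dict]:
        # Returns None when nothing has been stored for the key yet.
        return self._servables.get(key)


store = ServableStore()
# Hypothetical keys: one entry per notebook paragraph.
store.put("note-1/paragraph-3", {"type": "histogram", "bins": [0, 5, 10]})
store.put("note-2/paragraph-1", {"type": "table", "rows": [[1, 2]]})
```

With per-paragraph keys, two users working in different notebooks would not overwrite each other's output.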

On 24 Jul 2015, at 11:12, RPCMoritz wrote:

Another thing to consider is the multi-user capability of Zeppelin (only one sc per Zeppelin instance, but multiple browsers possible) which may impose some modifications to make some naive integration options transparently possible.


RPCMoritz commented 9 years ago

The iframe "solution" (it's more of a hack, really) appears to work, but it is not very notebook-friendly: the display isn't statically linked to the code. It could be good enough for intermittent demo-style usage (no need to go to the shell anymore), but long term we should aim for static rendering.

[image: iframe]
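For illustration, the iframe hack could look roughly like the sketch below: the paragraph prints a `%html` line that embeds the running DDS web UI. The URL and port are placeholders, and the helper `iframe_paragraph` is hypothetical.

```python
from html import escape


def iframe_paragraph(url: str, width: str = "100%", height: int = 600) -> str:
    """Build a %html paragraph output that embeds the given URL in an iframe."""
    # Escape the URL so quotes or ampersands cannot break the attribute.
    src = escape(url, quote=True)
    return f'%html <iframe src="{src}" width="{width}" height="{height}"></iframe>'


# Placeholder address; the real DDS server host and port may differ.
print(iframe_paragraph("http://localhost:8080/"))
```

Because the iframe always shows whatever the server currently serves, the paragraph output is not tied to the code that produced it, which is the drawback described above.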

RPCMoritz commented 9 years ago

I will look at the Angular interpreter next, to see if we can write a DDS interpreter that packages the JS and interprets Servables.

FRosner commented 9 years ago

:+1:

FRosner commented 9 years ago

@RPCMoritz To allow saving, we might be able to store the JSON-serialized servable. But then we need JavaScript code in the front end that has all the required libraries etc. to be able to render this servable.

However, settings you make in the visualization (changing scales etc.) will not be saved, because they are not reflected in the JSON so far. But this is something we could work on.
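A rough sketch of what storing the serialized servable together with view settings might look like. The field names `servable` and `viewSettings`, the helper functions, and the sample data are hypothetical; the actual DDS JSON format is not shown in this thread.

```python
import json
from typing import Tuple


def save_servable(servable: dict, view_settings: dict) -> str:
    """Serialize a servable plus the user's view settings into one JSON string."""
    return json.dumps(
        {"servable": servable, "viewSettings": view_settings}, sort_keys=True
    )


def load_servable(stored: str) -> Tuple[dict, dict]:
    """Restore the servable and its view settings (empty if none were saved)."""
    data = json.loads(stored)
    return data["servable"], data.get("viewSettings", {})


# Hypothetical servable and a scale change made in the visualization.
stored = save_servable(
    {"type": "scatter", "points": [[1, 2], [3, 4]]},
    {"xScale": "log"},
)
servable, settings = load_servable(stored)
```

The front end would still need the rendering libraries to turn the restored servable back into a chart; this sketch only covers the persistence side.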