metasoarous commented 10 years ago

Having spent a few years now working with ggplot2 in R, I'm really missing the ability to create aesthetic mappings while using gorilla-repl (particularly for color and size). The plot functions as they stand don't leave room for such mappings, so there would need to be another function (is ggplot too direct? Maybe aesth-plot or map-plot?) which takes a collection of maps and aesthetic mappings.
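
To make the request concrete, here is a minimal sketch of what "a collection of maps and aesthetic mappings" could look like in plain Clojure. The function name `apply-aesthetics`, the aesthetic keys, and the sample rows are all made up for illustration; none of this exists in gorilla-plot.

```clojure
;; Hypothetical sketch: not part of gorilla-plot or any existing library.

(defn apply-aesthetics
  "Turns each row (a map) into a point description by looking up the
  column named by each aesthetic. `aes` maps aesthetic -> column key."
  [aes rows]
  (mapv (fn [row]
          (into {}
                (map (fn [[aesthetic column]]
                       [aesthetic (get row column)])
                     aes)))
        rows))

;; Illustrative rows only, not real measurements.
(def sample-rows
  [{:height 1.0 :weight 10 :group "a"}
   {:height 2.0 :weight 20 :group "b"}])

(apply-aesthetics {:x :height :y :weight :color :group} sample-rows)
;; => [{:x 1.0, :y 10, :color "a"} {:x 2.0, :y 20, :color "b"}]
```

A plotting function could then consume these point descriptions and decide how `:color` or `:size` values are turned into actual marks.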

Is this something you would consider adding to gorilla-repl itself? Or would you rather have it as a separate library? Either way, I would like to contribute towards this effort, and would love to get feedback from the community.

ghost commented 10 years ago

I think it would be super cool to have a library that mimics ggplot for gorilla repl.

Imho having it externally seems to benefit gorilla repl by keeping it small and the ggplot lib by making it more self governed.

Give me a call when you start something, I’ll happily contribute :).

I’m currently working on a gorilla extension to create interactive renders with clojurescript. It is the foundation of some interactive plots we use internally, so maybe I should start documenting it ;)

Cheers Jan

(Sent by email in reply to Christopher Small's message of 21 Sep 2014, 23:52, which is quoted in full above.)

metasoarous commented 10 years ago

Wonderful!

I'm excited to hear you mention interactive features! One of the things I was actually thinking of with this ggplot idea was "wouldn't it be nice to have a 'hover-over' aesthetic mapping?" There are other sides of what I'm doing that could definitely benefit from interactivity as well.

daslu commented 10 years ago

Exciting to hear about your ideas,

it might be a good idea to learn from ggvis (https://github.com/rstudio/ggvis). This is a successor of ggplot2 which works in vegajs format (like gorilla-plot likes to do).

metasoarous commented 10 years ago

Beautiful! Thanks for sharing. That will be quite helpful to look at moving forward.

JonyEpsilon commented 10 years ago

It would be absolutely great to have a better plotting library, and while I've not used it, I understand ggplot2 is very highly regarded, so would be a good target for compatibility.

I agree with @ticking that it would be better as a separate library. I'd be happy to contribute to the effort, and also to make any changes to Gorilla that are needed to support it.

With regard to interactivity, that's something that's been on my mind for a while. I've got a few ideas kicking around: either offering a "watch" type facility where an output can be tied to an atom and updated whenever the atom updates, or maybe a sort of routable println using a channel. It would be good to see what you've been up to @ticking, if it's possible for you to share. It's important to get the foundation for interactivity right, and it would be interesting to discuss how to implement it elegantly. From reading the "readme" of ggvis it sounds like this is something that the R people have already thought about, so we should factor in their experience too.
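
As a rough illustration of the "watch" idea, the sketch below ties an output to an atom with `add-watch` from clojure.core. The `rendered` vector is only a stand-in for a notebook output cell, and `watch-output!` is a hypothetical name; nothing here is an existing Gorilla API.

```clojure
;; Hypothetical sketch of a "watch" facility; only add-watch, atom and
;; swap! are real clojure.core functions, the rest is made up.

(def rendered (atom []))   ; stand-in for a notebook output cell

(defn watch-output!
  "Re-renders `source` into `sink` whenever `source` changes."
  [source sink render-fn]
  (add-watch source ::output
             (fn [_key _ref _old new-value]
               (swap! sink conj (render-fn new-value))))
  source)

(def counter (atom 0))
(watch-output! counter rendered #(str "count = " %))

(swap! counter inc)
(swap! counter inc)

@rendered
;; => ["count = 1" "count = 2"]
```

Watch functions run synchronously on the thread that changes the atom, so a real implementation would presumably hand the rendered value off to whatever pushes updates to the browser rather than doing heavy work inside the watch.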

I should study ggvis and see how it all fits together, as it sounds close to what we might ultimately want, and there would surely be benefits from trying to be compatible. I'll try and add something more useful to this discussion once I've done my ggvis homework! Let me mention @hadley in case he has anything he'd like to add to this thread :-)

ghost commented 10 years ago

@JonyEpsilon Yeah I agree that it's important to get right :D, the things I've done so far are already public but still in stealth mode, hence no docs. I'll probably write something together and push that later today.

Here are the things that were important to me:

Be a library. No changes to gorilla repl required. (Remember #143 ;D?)
Preserve the ability to export the document as static html while still retaining interactivity in said document. This requires all computation to be done browser side.
Be able to write that logic in Clojure(Script).
The visualisation should be able to communicate both ways with the running clojure program, if absolutely necessary.

The way I've approached this is by having a function that takes quoted cljs, the data to be rendered, which will be embedded into the generated html and an optional core.async channel which will deliver a web socket connection once established.

The cljs code can then access the data via marmoset.client/env and the channel as marmoset.client/chan, and will be rendered in its own iframe. (This requires some css trickery to look seamless in all browsers, but will get a lot easier once the seamless attribute gets implemented more widely.)

Having iframes is cumbersome but solves the following problems:

cljs namespaces won't be able to break each other, otherwise two renders of the same kind would share global variables.
Rendering js won't be able to break the rest of the notebook.
Resources are actually collected on segment reevaluation. I have not found another way to close and gc the web socket connecting the browser and server.

There is also a macro that can be used to view cljs, and is just a thin wrapper around the rendering function.

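(ns marmoset.example
  (:require [marmoset.core :refer :all]
            [clojure.core.async :as async]))

;; Server side: the web socket connection is delivered on this channel
;; once the rendered view has connected.
(def ch (async/chan))

(cljs-view {:chan ch
            :env {:x "hello, world"}}
  ;; Everything below is ClojureScript that runs in the browser,
  ;; inside the view's own iframe.
  (ns foo
    (:require marmoset.client
              [cljs.core.async :refer [<!]])
    (:require-macros
     [cljs.core.async.macros :refer [go]]))
  (enable-console-print!)
  ;; The embedded data is available as marmoset.client/env.
  (println marmoset.client/env)
  ;; Show an alert for every message arriving from the server.
  (go (loop []
        (when-let [m (<! (:in marmoset.client/chan))]
          (js/alert (str m))
          (recur)))))

;; Block until the browser has connected, then send it a message.
(def c (async/<!! ch))
(async/put! (:out c) "hello channel")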

metasoarous commented 10 years ago

Great!

As far as "interactivity" goes, I was mostly imagining somewhat "canned" client side stuff. You've clearly got a much bolder and broader vision here, and I like it :-)

How decouplable do you think marmoset is from either the existing gorilla-plot functionality or a hypothetical ggplot clone that modelled its rendering closely after gorilla-plot? Do you think these should be separate libraries that try to accommodate each other, and if so how much additional complexity is incurred by this separation?

ghost commented 10 years ago

I'm not sure I understand the question correctly. Marmoset currently only provides functions to render clojurescript as html segments and connect websockets once they are embedded into gorilla. The library is only a few hundred lines of code and probably won't grow that much even with macros that make it easier to use. (I'll implement something like defcljs that will precompile the source and then just take an env and a channel to minimise cljs compilation overhead.)

In a custom plot lib setting one would define records that implement the Renderable interface as usual, and then return the html generated by marmoset as an :html type. Sadly this means that these plots can't be compatible with gorilla.plot compose, as these are always of the VegaView type.

To be honest I think even though it is nice that gorilla has vega support, I kinda dislike it.

Vega seems pretty unmaintained.
Some things can't be implemented at all without major preprocessing (see histograms).
The use of VegaView means that compose is a closed operation, and can't be extended.
The pervasive use of VegaView means that meaningful information is prematurely lost and replaced by a rather complex rendering instruction.

I think compatibility between multiple plot libs could be achieved though by making compose a multi method, and by having one Record type for each different plot. Library authors could then provide compatibility code for each library that should interact. The downside of this is though that this would require a lot of hand coding and it seems rather difficult in comparison with merging vega data.
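
The multimethod idea could look roughly like the sketch below: one record type per plot, and a `compose` that dispatches on the pair of types. The record names and the `Layered` result type are invented for illustration and are not gorilla-plot's actual types (gorilla-plot's compose works on Vega specs).

```clojure
;; Hypothetical sketch: record names are made up; this is not
;; gorilla-plot's compose.

(defrecord ScatterPlot [points])
(defrecord LinePlot [points])
(defrecord Layered [layers])

(defmulti compose
  "Combines two plots; dispatches on the pair of record types."
  (fn [a b] [(type a) (type b)]))

;; Each supported pair needs its own hand-written method.
(defmethod compose [ScatterPlot LinePlot] [a b]
  (->Layered [a b]))

(defmethod compose [Layered LinePlot] [a b]
  (update a :layers conj b))

;; Unsupported pairs fail loudly instead of silently dropping data.
(defmethod compose :default [a b]
  (throw (ex-info "No compose defined for this pair of plot types"
                  {:left (type a) :right (type b)})))

(def scatter (->ScatterPlot [[1 2] [2 3]]))
(def line    (->LinePlot [[0 0] [3 3]]))

(count (:layers (compose (compose scatter line) line)))
;; => 3
```

Because every pair of types needs its own method, the amount of glue code grows with the number of type pairs, which is the hand-coding cost mentioned above.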

JonyEpsilon commented 10 years ago

@ticking This looks really interesting. I'm looking forward to having a play around when I have a bit of time (term starting soon here, so very busy :-( )

Some general remarks:

It's great that marmoset functions as a library, as then free experimentation can be done without destabilising the core of Gorilla. That said, I do think there's a place in "core" Gorilla for something to enable interactivity. As I said, I had a rough idea in terms of watching atoms, but I'm certainly not fixed on that, and it'll be interesting to see how marmoset works out in practice. Maybe when I'm a bit less busy we can have a general discussion about what might be the right way to go?

With regard to the original topic of this issue. As @metasoarous says, the idea of building a general interaction framework, and also porting all of ggplot/ggvis over is very bold! It would be nice to think we can get it done, but it might take a (very, possibly approaching infinitely) long time. I'd be happy to consider whether we can add some features to gorilla-plot to solve the problem directly as well. It wouldn't be as elegant or general, but that was never the point of gorilla-plot - it exists to just get the job done without too much effort!

I agree with the comments @ticking has on Vega. I'm also worried about the maintenance situation. I think it works well for the limited scope that gorilla-plot targets, and was a nice way to get some sort of "data-driven" plotting working quickly. But I'm not at all viewing it as the one-and-only future for plotting in Gorilla.

The idea of extending compose sounds nice in principle, but I think it would be pretty tough in practice. I suspect it would turn into an N^2 type of thing, where for N plotting libraries you'd have to write N^2 bits of code to get them all to work together!

metasoarous commented 10 years ago

@ticking - Sorry for not being clearer; you've nevertheless answered my questions :-) For one, I was just clarifying what I suspected: that marmoset can function as a separate library from plotting code which would want to take advantage of it.

Another thing I was trying to get a sense of is how difficult it would be to have a custom plot lib be compatible both with the standard plotting/rendering and that of marmoset, which you've also addressed. Do you think it would be possible to return Vega data wrapped in an :html typed Renderable, such that javascript/clojurescript operating within the iframe did the rendering internally? I imagine that would be the easiest and cleanest implementation which would enable toggling between VegaView and marmoset compatible output.

@ticking & @JonyEpsilon - That's unfortunate that there are some concerns with Vega. I think for now, since that's what the standard plotting core is producing that's likely what I'll stick with, but I'm curious about what you'd think about alternatives. Perhaps d3? Is this something you evaluated at all @JonyEpsilon?

I'm a bit strapped for time right now (getting married in two weeks :-) ), so it's unlikely I'll be able to spend too much time on this until things settle down. But I hope to start sketching out a project and throw something up before too long so we can keep the discussion rolling. I'll assuredly keep you all posted. I'm thinking of the name grivet :-)

JonyEpsilon commented 10 years ago

Regarding other options than Vega: I haven't come across anything that really stands out yet. I recently stumbled upon vis.js http://visjs.org which looks to have a very light API, mostly driven by data, and a wide range of plot types. I haven't had a chance to play with it though.

@metasoarous Good luck with the wedding ;-)

ghost commented 10 years ago

Hey, I thought about this some more. What we really should do is implement a bunch of grammar of graphics components for om.

These components will be super easy to integrate and will be far easier to get contributors for.

Thoughts?

JonyEpsilon commented 10 years ago

Truthfully, I've never quite "got" Om, but that's a statement of not having had time to look at it closely, rather than any judgement on Om :-) So I'm not sure I'm in a position to contribute much of use. But that won't stop me writing a reply anyway! :-)

It would certainly be a nice thing to have a way to describe graphics from the ground-up in a GoG style. And I agree that an approach based on Om would likely be well-received and attract contributions well.

A useful thing, from the Gorilla side and probably more widely, would be for it to be relatively self-contained. That is to say, being able to point it at a DOM element and give it some data to render, without having to know about its internals. And being able to call it from js.

I know @kovasb started work on a similar thing, but using the Mathematica graphics language as its inspiration (I think), powered by Om. https://github.com/kovasb/yantra . I don't know how far along he's gotten with it though. Might be worth getting in touch with him to compare notes.

kovasb commented 10 years ago

So I've done a bunch of experiments along these lines, mostly unpublished. Getting something basic working is pretty easy, getting something that is both powerful and useable -- a general interaction framework -- is a challenge.

1. What is the state policy? Are state changes in the UI persistent into the notebook file? If you are streaming updates into the UI, are those updates streamed into the notebook file? Are updates to the om app state somehow reflected into the server?
2. What about the initial load? If loading UI depends on server-side user code being initialized, this is a concern and a departure from the current model. Mathematica handles this in a somewhat clunky way that often results in the UI being broken until you manually initialize the necessary functions.
3. How are events in the UI processed back on the server? Do you need the equivalent of the lifecycle methods and diffing on the server side? How do you specify the communication to the server in your om components?

Of course one can do without many of these aspects and simply use om as a data-oriented renderer, which is what I have in yantra. But unless there is some interaction between the view generated by om and the computations being built up server-side by the user, it's a very weak form of interactivity.

It's also possible to avoid solving the general case, and just come up with something very component-specific.

ghost commented 10 years ago

@kovasb @JonyEpsilon Yeah state management for interactive renders is a bit tricky. I think it is helpful to properly distinguish between server side interactivity, which allows you to change values in your clojure environment by interacting with the repl renderings, and client only interactivity which will let you interact with the way a rendered value is represented but without consequences for your program.

To address this, two implementations come to mind.

(a) A static default value burnt into the rendering code (in gorilla) or just a value (in session) for client only interactivity + an optional web socket connection for server side interactivity (only to be used sparingly).
   1. nothing is persistent but the default value, server restart makes a proper reconnect impossible anyways
   2. display the default value, visualise connection loss
   3. channels, server could receive everything or almost nothing depending on the client code
   This might be a bit better suited for gorilla because it has server side rendering anyways, so there is a place to set up the connection stuff. And it emphasises the use of a static value, with sparing use of volatile server interactivity when absolutely unavoidable.

(b) A synchronised atom between the client and server.
   1. everything is persistent, all changes get written into the notebook
   2. display the initial/last value of the atom, visualise connection loss
   3. the server is only informed of changes to the value
   This might be a better option for something like session where everything is serialised and rendered in the client anyhow, so there is no way to properly set things up on the server. Session already supports client only interactive rendering because it uses om for rendering which can have local state, so the only thing new would be an updating rendering mode for atoms (or ref types for that matter :smile: ).

Both of these don't really require to complect the om rendering code (which gives client only interactivity) with the state management (which gives server side interactivity) thus a common grammar of graphics library based on om would probably benefit both approaches and both projects. So yantra might be a good basis for what @metasoarous described.

Thoughts?

kovasb commented 10 years ago

Couple of comments

A clojurescript take on ggplot2 that is agnostic about these server-side issues makes a lot of sense. One possible bottleneck though is the need to transfer the raw dataset to the client to facilitate the "statistical transformations". Maybe not a big deal.

Setting up the channels from server to client requires a nontrivial amount of bookkeeping and is hard to get right. This amounts to managing identities on both sides. Also: are you gonna be sending the entire new state, or updates to the state? Sending the entire new state down the wire gets unreasonable once dealing with nontrivial use cases.

A big concern of mine, that might not be relevant for more narrowly scoped applications, is composability. It's always possible to hack something together. But if you want an open ended system, or something like nested components, there needs to be a well-defined protocol or idiom to follow.

For instance: I want to page through a list of results, where each result is showing me a real-time updating plot of server metrics. I want to do something like

(PagingComponent (map PlotServer servers) :results-per-page 10)

and have the return value of that render into the appropriate interface and set up the appropriate plumbing on both sides.
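
A minimal sketch of just the descriptive half of that example is below, with lower-case Clojure names standing in for `PagingComponent` and `PlotServer`. It only builds a nested data description; the rendering and the server/client plumbing, which are the hard part being discussed, are deliberately left out. All names and the shape of the returned maps are assumptions.

```clojure
;; Hypothetical sketch: builds a nested description only, no rendering
;; and no server/client plumbing.

(defn plot-server
  "Describes a live plot for one server (stand-in for PlotServer)."
  [server]
  {:type :plot :server server})

(defn paging-component
  "Groups child components into pages (stand-in for PagingComponent)."
  [items & {:keys [results-per-page] :or {results-per-page 10}}]
  {:type  :paging
   :pages (mapv vec (partition-all results-per-page items))})

(paging-component (map plot-server ["a" "b" "c"]) :results-per-page 2)
;; => {:type :paging,
;;     :pages [[{:type :plot, :server "a"} {:type :plot, :server "b"}]
;;             [{:type :plot, :server "c"}]]}
```

The composability question is then what a renderer has to do when it walks such a nested value: each child needs its own setup and teardown, and the parent needs to know which children are currently visible.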

ghost commented 10 years ago

In my experience the biggest lag is caused by transferring the needed cljs from the server to the client in gorilla (wrapped in a :html renderer).

The bookkeeping I encountered was pretty minimal. When a record that supports interactivity has called render on it, it will generate a UUID, and register a channel to receive incoming connections to that id. It will then generate the rendered html and embed the id together with the default value in it. Once it is rendered on the client side it will connect to the server with the provided id.
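
The bookkeeping described here could be sketched as follows. This is not marmoset's code: a `promise` stands in for the core.async channel so the example has no dependencies, the function names are hypothetical, and the "socket" is whatever value the web server hands over when the client connects.

```clojure
;; Hypothetical sketch of the id/connection registry; a promise is used
;; in place of the core.async channel mentioned in the thread.

(def connections (atom {}))   ; id string -> promise awaiting the connection

(defn register-render!
  "Called at render time: creates an id, registers a slot for the
  incoming connection, and returns what should be embedded in the html."
  [default-value]
  (let [id   (str (java.util.UUID/randomUUID))
        conn (promise)]
    (swap! connections assoc id conn)
    {:id id
     :connection conn
     :embed {:id id :default default-value}}))

(defn client-connected!
  "Called when a browser connects with `id`. Delivers the socket to the
  waiting render and forgets the id; returns nil for unknown ids."
  [id socket]
  (when-let [conn (get @connections id)]
    (deliver conn socket)
    (swap! connections dissoc id)
    true))

(def render (register-render! {:x 1}))
(client-connected! (:id render) :fake-socket)
@(:connection render)
;; => :fake-socket
```

Removing the id once the connection is delivered keeps the registry small, at the cost of not supporting reconnects under the same id.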

As for everything/changes/diff, this is pretty much up to rendering code in the value + channel approach. In the synchronised atom case one will probably want to do diffing.
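
For the synchronised atom case, one simple way to send changes rather than the whole state is `clojure.data/diff`, which ships with Clojure. The wrapper name `state-delta` and the shape of the message are assumptions made for this sketch.

```clojure
(require '[clojure.data :as data])

(defn state-delta
  "Returns only the parts of the state that differ between two versions.
  Hypothetical helper; the message shape is made up for this sketch."
  [old-state new-state]
  (let [[only-old only-new _both] (data/diff old-state new-state)]
    {:before only-old :after only-new}))

(state-delta {:slider 3 :color "red"}
             {:slider 4 :color "red"})
;; => {:before {:slider 3}, :after {:slider 4}}
```

Only the `:after` part would need to go over the wire in the common case; `:before` is useful mainly for detecting removed keys or conflicting edits.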

The composability issues are pretty different in gorilla and session because of the way they handle rendering. Gorilla renders on the server, so all the setup and plumbing can be done there. Session on the other hand has to do the plumbing without an explicit render step in the server, rendering tagged literals in the client.

The value+channel stuff for gorilla is pretty composable so far: larger components like the paging component will call render on the subcomponents, bubbling setup down the hierarchy. The resulting html will then be comprised of a lot of nested iframes, which each handle their setup. Using iframes is a bit ugly without proper seamless support, but works surprisingly well otherwise. A big downside is no code reuse between renders (a huge amount of code). The strict render sandboxing is also a double edged sword: it is nice that renders can't interfere, but everything has to be constantly updated from the server, not knowing what is visible and what not.

As far as the plumbing on session goes, it will probably be a lot harder to do. Like I said, I think having atoms to represent changes is the best bet but the implementation is beyond me so far.

JonyEpsilon commented 9 years ago

I think the only thing I can add is that I agree that general interactivity is going to be difficult!

My plan for the core was to add some very limited interactivity - undecided but thinking about either a "watch" function for atoms, or even simpler a "println" equivalent function that can output rendered material. I think I'd like to keep the core of Gorilla very simple and predictable, and this would cover many of the basic use cases for interactivity.

Of course, it would be great to see more sophisticated interactive approaches as libraries, and am happy to do whatever needs to be done to support that :-)

kovasb commented 9 years ago

I'm a little bit slow today, so maybe you can forgive me if I'm confused about the details of the proposal.

1. Is it possible for interactive components to communicate between each other? What about interactivity between separate inputs (for instance create a slider in one input, bind its value to a graphic in another input)?
2. Is there a problem with resource leakage? I'm assuming the interactive components are backed by something stateful on the server-side, including potentially database connections. If there is a hierarchy of these things, and that hierarchy changes, this all needs to get cleaned up, both on client and server.
3. Is it possible to define the current view as a function of some (changing) data, or is the intention to just build up a protocol for communication and leave tying it all together as a user concern?

FWIW, in my experimental branch I had implemented an om-inspired construct on the server side that would do the whole walking/"rendering" (more generally, just computation returning values)/diffing/lifecycle dance, and then send those results to the appropriate position in the om tree client-side, which would communicate back to the components using channels like you did. It seemed to work reasonably well, but I was not satisfied with the user-facing api.

Of course I had to add some hooks on the server-side, but given that the client is already in Om it was quite easy to make the changes there.

ghost commented 9 years ago

Gorilla

1. Currently only through the server. It would be easy to implement message passing between them, but going through the server seemed cleaner.
2. Depends on whether you want to be able to reconnect or not. The resources on the client side will be released once the iframe disappears, but there will have to be some small bookkeeping done on the server to remember UUID/Channel pairs. Everything else is up to the person implementing the render function.
3. Yeah, all the library would do is supply some functions to make rendering cljs to html easier, as well as providing the basics for channel hookup.

Mind that the above really only works for gorilla. Especially because it only uses the existing rendering infrastructure and is easy to add on as a library.

Session

I thought a bit more about the session case and was wondering if one couldn't introduce the concept of a "live loop". Currently a loop's input is just some raw text and its output is a tagged literal that gets rendered appropriately, right? What if both were tagged literals? In case of editing clojure the input would be tagged as #session.clojure and be rendered as a flense component. This would mean one would have client only interactivity in input as well as output, as the input could also contain sub values like #session.slider or #session.color, that would be embedded into the code as literals. If these were also allowed as top-level forms, one might want to dispatch on their type in the server and call an appropriate eval, but I think having, for example, properly rendered color literals in code would be good enough for a start :). Live loops would then evaluate as soon as the input data changes, instead of waiting for explicit evaluation. The server could also push updates when needed to the output part. But all in all this would be minimally different from the current approach.

Read (custom rendered tagged literal in an om atom)
Eval (send atom's content to server and call eval)
Print (custom rendered tagged literal in an om atom) ;)

Note that this would separate the server interactivity into two unidirectional parts. This would also cleanly separate the client interactivity (which session already has) from the server interactivity.

rendered input literal --chan--> server --chan--> rendered output literal.
|client interactivity |  server interactivity   |  client interactivity  |
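
The "Read" step can be approximated on the server with the edn reader that ships with Clojure, which lets you supply a function per tag. In this sketch the tags are written with a namespace prefix (`#session/slider` rather than `#session.slider`) because the edn format reserves un-prefixed tags; that spelling, the reader functions, and the resulting map shapes are assumptions and not Session's actual representation.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical reader functions: each turns the tagged value into a
;; plain map that a renderer (or eval step) could dispatch on.
(def session-readers
  {'session/slider (fn [m] (assoc m :widget :slider))
   'session/color  (fn [s] {:widget :color :value s})})

(defn read-input
  "Reads one input form, expanding the session tags it contains."
  [text]
  (edn/read-string {:readers session-readers} text))

(read-input "[#session/slider {:value 3} #session/color \"#ff0000\"]")
;; => [{:value 3, :widget :slider} {:widget :color, :value "#ff0000"}]
```

A live loop would rerun something like `read-input` followed by eval whenever the client-side value of one of these literals changes.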

daslu commented 9 years ago

It might be relevant to read into this ongoing discussion of future developments in Vega: https://groups.google.com/forum/#!topic/vega-js/dp4seakxosY

Vega itself is planned to support declarative definitions of interactive plots.

JonyEpsilon commented 9 years ago

@daslu Thanks for the pointer, that is a very interesting discussion, and I will have to take a look at the 2.0 branch once I have some free time :-)

JonyEpsilon commented 9 years ago

So, not exactly what we were talking about, but I put together this, which works better than I thought it would:

https://github.com/JonyEpsilon/gg4clj

Demo here:

http://viewer.gorilla-repl.org/view.html?source=github&user=JonyEpsilon&repo=gg4clj&path=ws/demo.clj

JonyEpsilon commented 9 years ago

I think I'm going to close this one: now that JonyEpsilon/gg4clj exists, I'm seeing gorilla-plot as more of the quick-and-simple plot library.

JonyEpsilon / gorilla-repl

Ggplot style aesthetic mapping features? #167
