GiovineItalia / Gadfly.jl

Crafty statistical graphics for Julia.
http://gadflyjl.org/stable/

Discussion: Subplots versus Facets #318

Open baptiste opened 10 years ago

baptiste commented 10 years ago

I really like the design of Gadfly, but this one thing I find really strange: why would faceting be described by a Geom? I find it much clearer as a fully-independent concept (of scales, stats, guides, geoms), as in the ggplot2 idiom.
I could understand if it was an aesthetic, in the sense that it describes a way to split the data into groups. But Geoms, to me, are a completely orthogonal notion.

Is it a design choice driven by ease of implementation, or is there a deeper reason that I'm missing?

(I hope I'm not abusing the issue tracker; I thought this discussion might be useful.)

dcjones commented 10 years ago

It's fine to ask questions here. There isn't yet another Gadfly-specific venue to do so, which is maybe something that should change.

My choice to implement facets as subplots was actually influenced by Hadley Wickham rethinking his own design, prompted by a paper his student wrote.

It's a little alien coming from ggplot2, but I think the subplot notion is an elegant generalization of facets. Geometries are just functions that take data and draw something on a particular coordinate system. I think subplot_grid fits into that notion: the coordinate system is an m-by-n grid, and what it draws is smaller plots.
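To make that concrete, here is a minimal sketch of faceting written as a geometry. It assumes the RDatasets package is installed, which is used only to supply the iris example data; the column names come from that dataset, not from this discussion.

```julia
using Gadfly, RDatasets  # RDatasets is an assumption, used only for example data

iris = dataset("datasets", "iris")

# Geom.subplot_grid is passed to plot like any other geometry.
# The xgroup aesthetic picks the grid column each row of data is drawn in,
# and the inner Geom.point is what gets drawn inside every panel.
p = plot(iris, xgroup="Species", x="SepalLength", y="SepalWidth",
         Geom.subplot_grid(Geom.point))
```

The grouping variable is bound as an aesthetic (xgroup), while the layout itself is the geometry, which is the split described above.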

The notion of subplots can be used to do other fancier things, which the "embedded plots" paper discusses. I hope to implement some of that in Gadfly. Having both facets and subplots would have ended up confusing, since they do similar things, hence I attempted to unify the notion.

baptiste commented 10 years ago

Thanks – certainly something to think about. I hadn't followed that particular ggplot2 discussion very closely. I like the concept of subplots, seeing any plot as a particular type of glyph, and the recursive relation between plots and glyphs. In fact I always thought it should be generalised to other plot elements, such as axes or strips being a custom function of the data (lattice always offered this possibility, but without a proper grammar). However, I'm not really sure I'm ready to embrace the idea of seeing facets as a special case of subplots, because those subplots are not positioned within a "real" data space anymore, but mapped to an abstract, ad hoc tabular space defined by the page layout. Maybe it makes sense; after all, the layout of facets can be (loosely) specified by the groups of data, and facet sizes may even scale with the data. I need to think more about it.

dcjones commented 10 years ago

With Geom.subplot_grid the data space is as real as any other categorical data, but that's a completely valid point when it comes to something like ggplot2's facet_wrap, which I've yet to implement the equivalent of. So far I like the idea of subplots, but I'm always open to reevaluating that position if it ends up being painful for everyone.

Also, I agree completely about plot elements like axes, etc., being first-class members of the grammar. I've tried to do that with the notion of guides in Gadfly (Guide.xlabel, Guide.xticks, etc.), which work almost identically to geometries but with some special layout concerns.
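As a small illustration of guides sitting alongside geometries, the sketch below passes both to the same plot call. The data, the label text, and the tick positions are made up for the example.

```julia
using Gadfly

xs = collect(1:10)   # illustrative data only
ys = xs .^ 2

# Guides are passed to plot in the same way as geometries:
# Guide.xlabel sets the x-axis label, Guide.xticks fixes the tick positions.
p = plot(x=xs, y=ys,
         Geom.point,
         Guide.xlabel("input"),
         Guide.xticks(ticks=[2, 4, 6, 8, 10]))
```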

I'm renaming this issue and keeping it open so others can weigh in if they'd like.

cpsievert commented 10 years ago

Just so others know, Hadley has an updated version of that embedded plots paper here.

The power of embedded plots seems to come from their two-tiered structure (subplots embedded inside a "higher-level" plot). To quote the paper: "The axes of the subplot do not have to be the same as the axes that the subplot is positioned on. In fact, the subplot can use an entirely different coordinate system than the higher level plot."

I am new to Gadfly and have yet to see any examples that truly exploit this idea (beyond the facet_grid layout as in ggplot2). Are there any such examples? If this isn't yet possible in Gadfly, what do you see as the main hurdles to its implementation?

dcjones commented 10 years ago

Thanks for the pointer @cpsievert.

You're right that the subplot idea isn't yet exploited in Gadfly beyond faceting, but Gadfly is far from finished and I do intend to push more in this direction. Specifically, I want to add Geom.subplot_wrap, which will be the equivalent of facet_wrap, and Geom.subplot, which will do the sort of positioned subplots described in the paper.

There's really no major hurdle to implementation. Most of the tricky parts of subplots have been worked out with Geom.subplot_grid. It's just a matter of me finding enough time to work on it, or some other brave soul stepping up.

cpsievert commented 10 years ago

Great, thanks! Having a facet_wrap equivalent would be nice, but given the current state of ggsubplot, I think Geom.subplot would be of greater value. I look forward to following developments of (and possibly contributing to) Gadfly!