Open baptiste opened 10 years ago
It's fine to ask questions here. There's not yet another Gadfly specific venue to do so, which is maybe something that should change.
My choice to implement facets as subplots was actually influenced by Hadley Wickham rethinking his own design, prompted by a paper his student wrote.
It's a little alien coming from ggplot2, but I think the subplot notion is an elegant generalization of facets. Geometries are just functions that take data and draw something on a particular coordinate system. I think subplot_grid fits into that notion: the coordinate system is an m-by-n grid, and what it draws is smaller plots.
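As a rough illustration of that idea, here is a toy model in Python. It is not Gadfly's API or implementation, and every name in it is hypothetical. A geometry is modeled as a function from data to drawn output, and the grid geometry is one whose "coordinate system" is the grid of category pairs and whose output in each cell is whatever an inner geometry draws.

```python
from collections import defaultdict


def point_geom(rows):
    """Toy geometry: 'draws' points by returning sorted (x, y) pairs."""
    return sorted((r["x"], r["y"]) for r in rows)


def subplot_grid(inner_geom):
    """Toy geometry built from another geometry (hypothetical, not Gadfly's).

    Returns a function that splits rows by (ygroup, xgroup) and applies
    inner_geom within each cell of the resulting m-by-n grid.
    """
    def geom(rows):
        cells = defaultdict(list)
        for r in rows:
            cells[(r["ygroup"], r["xgroup"])].append(r)
        row_keys = sorted({k[0] for k in cells})
        col_keys = sorted({k[1] for k in cells})
        # Every grid cell is drawn, even when no data falls into it.
        return {
            (rk, ck): inner_geom(cells.get((rk, ck), []))
            for rk in row_keys
            for ck in col_keys
        }
    return geom


data = [
    {"x": 1, "y": 2, "xgroup": "a", "ygroup": "u"},
    {"x": 3, "y": 4, "xgroup": "b", "ygroup": "u"},
    {"x": 5, "y": 6, "xgroup": "a", "ygroup": "v"},
]

# The grid geometry is used exactly like any other geometry: call it on data.
grid = subplot_grid(point_geom)(data)
```

Because `subplot_grid(point_geom)` is itself just a function of the data, it could in principle be nested inside another geometry, which is the recursion the subplot notion allows.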
The notion of subplots can be used to do other fancier things, which the "embedded plots" paper discusses. I hope to implement some of that in Gadfly. Having both facets and subplots would have ended up confusing, since they do similar things, hence I attempted to unify the notion.
Thanks – certainly something to think about, I hadn't followed very closely that particular ggplot2 discussion. I like the concept of subplots, seeing any plot as a particular type of glyph, and the recursive relation between plots and glyphs. In fact I always thought it should be generalised to other plot elements, such as axes or strips being a custom function of the data (lattice always offered this possibility, but without proper grammar). However I'm not really sure I'm ready to embrace the idea of seeing facets as a special case of subplots, because those subplots are not positioned within a "real" data space anymore, but mapped to an abstract, ad hoc tabular space defined by the page layout. Maybe it makes sense, after all the layout of facets can be (loosely) specified by the groups of data -- facet sizes may even scale with the data --, I need to think more about it.
With Geom.subplot_grid the data space is as real as any other categorical data, but that's a completely valid point when it comes to something like ggplot2's facet_wrap, which I've yet to implement the equivalent of. So far I like the idea of subplots, but I'm always open to reevaluating that position if it ends up being painful for everyone.
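To make that distinction concrete, here is a small Python sketch with hypothetical helper names (not taken from Gadfly or ggplot2). In a grid, a panel's position is fixed by two categorical values, just like a point on two categorical axes. In a wrapped layout, the position is the panel's index folded into a chosen number of columns, so it depends on the page layout as well as the data.

```python
def grid_position(xlevels, ylevels, xvalue, yvalue):
    """Panel position in a grid: each axis is an ordinary categorical scale."""
    return ylevels.index(yvalue), xlevels.index(xvalue)


def wrap_position(levels, value, ncol):
    """Panel position in a wrapped layout: the index folded into ncol columns."""
    return divmod(levels.index(value), ncol)


xlevels = ["a", "b", "c", "d"]
ylevels = ["u", "v", "w"]

# Grid: the (row, column) follows from the data values alone.
in_grid = grid_position(xlevels, ylevels, "c", "v")

# Wrap: the same panel moves when the layout parameter ncol changes.
wrapped_3 = wrap_position(xlevels, "d", 3)
wrapped_2 = wrap_position(xlevels, "d", 2)
```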
Also, I agree completely about plot elements like axes, etc, being first-class members of the grammar. I've tried to do that with the notion of guides in Gadfly (Guide.xlabel, Guide.xticks, etc), which work almost identically to geometries but with some special layout concerns.
I'm renaming this issue and keeping it open so others can weigh in if they'd like.
Just so others know, Hadley has an updated version of that embedded plots paper here.
The power of embedded plots seems to come from their two-tiered structure (subplots embedded inside a "higher-level" plot). To quote the paper: "The axes of the subplot do not have to be the same as the axes that the subplot is positioned on. In fact, the subplot can use an entirely different coordinate system than the higher level plot."
I am new to Gadfly and have yet to see any examples that truly exploit this idea (beyond the facet_grid layout as in ggplot2). Are there any such examples? If this isn't yet possible in Gadfly, what do you see as the main hurdles to its implementation?
Thanks for the pointer @cpsievert.
You're right that the subplot idea isn't yet exploited in Gadfly beyond faceting, but Gadfly is far from finished and I do intend to push more in this direction. Specifically I want to add Geom.subplot_wrap, which will be the equivalent of facet_wrap, and Geom.subplot that will do the sort of positioned subplots described in the paper.
There's really no major hurdle to implementation. Most of the tricky parts of subplots have been worked out with Geom.subplot_grid. It's just a matter of me finding enough time to work on it, or some other brave soul stepping up.
I really like the design of Gadfly, but there is one thing I find really strange: why would faceting be described by a Geom? I find it much clearer as a fully independent concept (separate from scales, stats, guides, geoms), as in the ggplot2 idiom.
I could understand if it was an aesthetic, in the sense that it describes a way to split the data into groups. But Geoms, to me, are a completely orthogonal notion.
Is it a design choice driven by ease of implementation, or is there a deeper reason that I'm missing?
(I hope I'm not abusing the Issue report system; I thought this discussion might be useful.)