MakieOrg / AlgebraOfGraphics.jl

An algebraic spin on grammar-of-graphics data visualization in Julia. Powered by the Makie.jl plotting ecosystem.
https://aog.makie.org
MIT License

Rename style and spec #74

Closed jkrumbiegel closed 4 years ago

jkrumbiegel commented 4 years ago

I still get confused by the meaning of these. To me that suggests a better naming might be needed.

Style sounds to me like what spec is doing and spec sounds more like what style is doing. Both are not very clear about their use.

Style is explained as "the mapping from data to plot" so I'd suggest renaming it to mapping.

spec controls any other settings a grouping might have, so maybe settings would be good

piever commented 4 years ago

I see how that can be confusing, and I guess it doesn't help that neither has docstrings...

Style comes from the Grammar of Graphics idea that you "style the plot with data". They use the word "aesthetics", but somehow that feels a bit too much. I see your point that it is a mapping or a binding. AFAIU, in Vue.js or React.js this is called binding (like v-bind), whereas it is called viewmodel in "Model-View-ViewModel" architectures like Knockout.js.

I kind of like the name style, but I'm not very fond of spec as a public API. Maybe theme would be better, esp. since you can use it to pass a theme specific to a plot, things like

style(:SepalLength, :SepalWidth, color = :Species) * theme(Scatter, color = [:red, :blue, :green])

style is not completely accurate as a name because you can do for example style(x, y, wts = z) * linear and wts represents the "data-based" option given to the analysis, whereas "non-data" options given to the analysis would be passed to linear directly. For example

style(x, y, wts = z) * linear(interval = :confidence)

which IMO is a nice way to split keywords. By this logic, one should do style(x, y) * scatter(marker = :cross) (or Scatter(marker = :cross)), and be rid of spec altogether.

So possible ways forward could be:

  1. mapping and theme,
  2. mapping and settings (or other synonyms, like options),
  3. style and theme,
  4. style and settings (or other synonyms, like options),
  5. style remains the same, but we unify the old spec with analyses somehow (not 100% sure how).

piever commented 4 years ago

To clarify the last points, Analysis is essentially a "partially evaluated function", where some keyword arguments can be set by the user independently of the data. Somehow, I wish we could do the same for the plotting part.
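
A minimal sketch of the "partially evaluated function" idea, assuming nothing about how Analysis is actually implemented. The names PartialCall, partial and weighted_mean are hypothetical and only serve the illustration: data-independent keywords are stored up front, and data-based arguments (like wts) arrive at call time.

```julia
# Hypothetical sketch, not the package's actual Analysis type.
struct PartialCall{F,K<:NamedTuple}
    f::F        # the function to call later
    kwargs::K   # data-independent keyword arguments, stored up front
end

# Store keyword arguments without calling `f` yet.
partial(f; kwargs...) = PartialCall(f, (; kwargs...))

# Calling the object supplies the data and merges in the stored options.
(p::PartialCall)(args...; kwargs...) = p.f(args...; p.kwargs..., kwargs...)

# Toy "analysis": weighted mean of y with a data-independent `scale` option.
function weighted_mean(x, y; wts = ones(length(y)), scale = 1.0)
    return scale * sum(wts .* y) / sum(wts)
end

# Comparable in spirit to linear(interval = :confidence): no data involved yet.
analysis = partial(weighted_mean, scale = 2.0)

# The data-based option `wts` is supplied together with the data.
result = analysis([1, 2, 3], [10.0, 20.0, 30.0]; wts = [1.0, 1.0, 2.0])
```

The same pattern would apply to the plotting part if a plot type plus fixed attributes were stored in place of the function.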

jkrumbiegel commented 4 years ago

Aha, so spec is plot plus data-independent settings; then settings is not such a good name. Maybe layer?

What is the current way of mapping keywords under style to the expected content? Is something like color or strokecolor hard-coded to refer to colors or is that done via dispatch? Is the current system flexible enough to put basically arbitrary plot functions in a spec / layer?

piever commented 4 years ago

I like layer! I've just realized that I use internally a function that returns a list of Specs, and I had called it layers... The idea would be that every addend has an implicit layer(Any) (use the default plot for the given arguments), which can be customized.

Pretty much nothing is hardcoded (and I would like as much as possible to keep it that way!). The mechanism is as follows:

So far the first argument of spec is assumed to be some T for which AbstractPlotting.plot!(::Scene, T, ::Attributes, args...) is defined, so any plotting type should work, and any new @recipe should work automatically. In principle, one could also allow functions (it's very easy to add that, and probably a good idea), provided they have a method f(scene, args...; attributes...) or something like that. Do you imagine doing something like:

layer() do scene, attributes, args... # I wonder if one can use keyword arguments in a do block
    # some custom plot
end

?

jkrumbiegel commented 4 years ago

What I was wondering about is this: you have a plot function which takes the keyword color, and you can pass a single color, an array of floats or an array of colors to that. Now you want to use this with AlgebraOfGraphics. You want to use the keyword color in the style / mapping call and feed it with some column from a data frame which has an arbitrary element type. So how does AlgebraOfGraphics know how to transform that input data? Is it hard-coded that the color keyword means a transformation to color values is necessary? Does AlgebraOfGraphics create vectors of colors using internal palettes and pass those on directly, or does it pass on raw data and a color map with adequate ranges? What if I have a plotting function that takes the keyword hue: how can I tell AlgebraOfGraphics that colors are needed for this keyword? I'm imagining a pipeline where you can dispatch on keyword symbols to override transformation targets per keyword and plotting function, with sensible defaults like :color => color transformation.
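
A rough sketch of what such a dispatch-based pipeline could look like. Everything here is hypothetical (scale_kind, the stand-in Scatter and MyPlot types, the :color and :identity tags); it is not AlgebraOfGraphics API, only an illustration of dispatching on the keyword symbol and the plot type.

```julia
# Stand-in plot types so the example is self-contained.
struct Scatter end
struct MyPlot end

# Default: an unknown keyword gets no transformation, raw data is passed on.
scale_kind(plot, ::Val) = :identity

# Sensible default: `color` means a color transformation for every plot type.
scale_kind(plot, ::Val{:color}) = :color

# Per-plot override: MyPlot declares that its `hue` keyword is color-like.
scale_kind(::MyPlot, ::Val{:hue}) = :color

# Convenience entry point taking the keyword as a plain Symbol.
scale_kind(plot, key::Symbol) = scale_kind(plot, Val(key))

scale_kind(Scatter(), :color)   # :color
scale_kind(Scatter(), :hue)     # :identity
scale_kind(MyPlot(), :hue)      # :color
```

A recipe author would add one method per custom keyword, and everything else would fall back to the defaults.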

piever commented 4 years ago

Ah I see. Going from arbitrary elements in your data to "plottable" things is done with scales (the interface is not fully crystallized yet). For now AlgebraOfGraphics looks for an appropriate scale in AbstractPlotting.current_theme()[:palette] (which has some of these scales). You can pass a specific scale to spec. For example style(:x, :y, hue = :z) * spec(MyPlot, hue = [:red, :blue, :green]) would work. If you want to tell it you want colors, you'd do

theme(c) = AbstractPlotting.current_theme()[:palette][c]
spec(MyPlot, hue = theme(:color))

The scales need to be cleaned up at some point, probably getting a custom type rather than just lists, with support for continuous scales, and easier to customize themes. The clean interface is apply_scale(scale, value) (if a scale is in the theme, or provided explicitly), or just pass value along otherwise. (Probably we should discuss the "scales" in a separate issue though.)
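
To make the lookup order concrete, here is a small sketch under stated assumptions. Only the name apply_scale comes from the comment above. The palette dictionary is a made-up stand-in for AbstractPlotting.current_theme()[:palette], find_scale is a hypothetical helper, and applying a list scale to a whole column by rank among the sorted unique values is my guess at reasonable behavior, not the package's actual rule.

```julia
# No scale available: pass the value along untouched.
apply_scale(::Nothing, value) = value

# A scale given as a plain list: pick the entry by the rank of each value among
# the sorted unique values of the column, cycling if the list is too short.
function apply_scale(scale::AbstractVector, column::AbstractVector)
    levels = sort(unique(column))
    return [scale[mod1(findfirst(==(v), levels), length(scale))] for v in column]
end

# Made-up stand-in for the theme palette.
palette = Dict(:color => [:red, :blue, :green])

# An explicitly provided scale wins, then the theme, otherwise there is no scale.
find_scale(key, explicit) = get(explicit, key, get(palette, key, nothing))

species = ["setosa", "virginica", "setosa", "versicolor"]

# `color` is found in the theme palette.
colors = apply_scale(find_scale(:color, Dict()), species)

# `hue` is not in the theme, so the scale has to be passed explicitly.
hues = apply_scale(find_scale(:hue, Dict(:hue => [:red, :blue, :green])), species)

# `linewidth` has no scale anywhere, so the data is passed along as is.
widths = apply_scale(find_scale(:linewidth, Dict()), [1.0, 2.0])
```

With these inputs, colors and hues both come out as [:red, :green, :red, :blue], and widths is returned unchanged.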

Last thing: concerning naming, I forgot to mention that our spec / layer is called geometry in standard Grammar Of Graphics, but I'm not sure if we want to call it that.

jkrumbiegel commented 4 years ago

In grammar of graphics, every geom has a statistic and every statistic has a geom, so it would have made sense to call them layers as well ;)

jkrumbiegel commented 4 years ago

And I thought it would work like this (we can just make a separate issue though):

mapping(:x, :y,
    some_color_attr = :some_data => colorscale(log, palette = :viridis)) * layer(...

piever commented 4 years ago

I use the some_attr = :col => f to do transformations on the fly (like making the variable categorical, or cutting it into discrete bins), and perhaps even to give a new name to the column (see #69). In the original design (see here) it was the way you suggested. I've opened #75 to discuss it.

piever commented 4 years ago

> And I thought it would work like this (we can just make a separate issue though):
>
> mapping(:x, :y,
>     some_color_attr = :some_data => colorscale(log, palette = :viridis)) * layer(...

Related question: do we have log colorscales in Makie? Do you have some example code I could play with to see how that would work? I'm actually not entirely sure what to expect from a log colorscale (in terms of ticks on the colorbar, how the color varies and so on).

jfb-h commented 4 years ago

just to chime in on this from a user's perspective: I like mapping + layer, that would make it very clear IMO. Love the package btw, something like this has really been missing in the Julia plotting ecosystem!

jkrumbiegel commented 4 years ago

Oh forgot to respond to that last comment. Log colorscale just means that the data values are exponential and the linear color gradient maps to that exponential data. So the colorbar looks the same as usual, just with tick marks like 10, 100, 1000

You can of course achieve the same colors with a linear colorscale and log of the data first, but then your tick marks might be less intuitive if you're not used to exponents
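
As a small numeric illustration of the two statements above (a sketch only: colormap_position and its transform keyword are hypothetical, not a Makie function), the position along the colormap can be computed by applying a transform to both the value and the range limits:

```julia
# Position in [0, 1] along the colormap for `value`, after applying `transform`
# to the value and to both ends of the (lo, hi) range.
function colormap_position(value, lo, hi; transform = identity)
    t = (transform(value) - transform(lo)) / (transform(hi) - transform(lo))
    return clamp(t, 0.0, 1.0)
end

# Linear scale over 10..1000: the value 100 sits near the bottom of the gradient.
linear_pos = colormap_position(100, 10, 1000)

# Log scale over the same range: 100 sits halfway between 10 and 1000.
log_pos = colormap_position(100, 10, 1000; transform = log10)

# Same color from a linear scale applied to pre-logged data; only the tick
# labels would differ (1, 2, 3 instead of 10, 100, 1000).
prelogged_pos = colormap_position(log10(100), log10(10), log10(1000))
```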

jkrumbiegel commented 4 years ago

And we don't have that yet as the colorrange / colormap combination only has a min/max value combination, not a transform function

kleinschmidt commented 4 years ago

EDIT: I am confused, disregard this for now...

On Slack we discussed this a bit, and @piever suggested using bind for what is now spec (link styles to data), which I think is great: it captures that you're creating a mapping between the data and some attribute, it's easy to remember, short, pretty specific, and intuitive (at least to me).

piever commented 4 years ago

So, the ideas spitballed on Slack were:

Instead of the current spec, that essentially holds a plot type and a dict of attributes, it would be interesting to see if AbstractPlotting can support that (basically support a nice way for the user to pass plot type and some attributes).

jkrumbiegel commented 4 years ago

I think bind is a good choice for the data part.

For me, theme sounds more restricted to some attributes that theme plot objects which are specified elsewhere. I don't think Scatter would really be part of the theme, it is a visualization of a binding / mapping that can then be themed. Maybe another way to describe this part of the plot could therefore be visual?

What I don't understand, yet, is how the current approach can support all possible plotting functions, no matter what their arguments and attributes are. Basically, we have arguments that are set based on grouped data (right now that's style) and arguments that are set manually (that's spec). Both of them can control visual aspects of the plot or statistical aspects, it's really the source of the attributes that's different. And I'm not sure if the current pipeline is generic enough to feed it any kind of plotting function. It feels like there are hard coded assumptions, for example that color is passed to a color keyword, and that colorbars should look for that. Instead, I think we should think about base behaviors that we want, and how we can communicate that to the API. So any arbitrarily named positional or keyword argument should be able to be told "you're a continuous color related attribute" and create a colorbar

pdimens commented 4 years ago

To add another voice to this discussion, the combination of bind (bind data to plot attributes) and style (style everything else) are IMO the most intuitive and succinct terms to describe the two. I think style is flexible enough to allow the static definition of plot components, but also specify the style of plot you want (e.g. a scatterplot vs barplot). If you think of any plot type as one of several options used to visualize data, then it seems fair to say you've chosen to style your visualization as a _____plot.

piever commented 4 years ago

So, from what I understand there is consensus about current style becoming bind.

> For me, theme sounds more restricted to some attributes that theme plot objects which are specified elsewhere. I don't think Scatter would really be part of the theme, it is a visualization of a binding / mapping that can then be themed. Maybe another way to describe this part of the plot could therefore be visual?

Agreed! I like visual as in "the visual part of the specification, that does not depend on data".

> What I don't understand, yet, is how the current approach can support all possible plotting functions, no matter what their arguments and attributes are. Basically, we have arguments that are set based on grouped data (right now that's style) and arguments that are set manually (that's spec). Both of them can control visual aspects of the plot or statistical aspects, it's really the source of the attributes that's different. And I'm not sure if the current pipeline is generic enough to feed it any kind of plotting function. It feels like there are hard coded assumptions, for example that color is passed to a color keyword, and that colorbars should look for that. Instead, I think we should think about base behaviors that we want, and how we can communicate that to the API. So any arbitrarily named positional or keyword argument should be able to be told "you're a continuous color related attribute" and create a colorbar

This will be fixed by the "scale" rework. In particular, one should allow bind(my_attribute = :some_variable => default_color_scale), that is, bind would keep some hard-coded scales (as it does now), but would also accept scales passed explicitly. The default scales have to be hard-coded, but if a plot recipe uses its own keyword, the user would just have to pass the appropriate scale. We can discuss in more detail as soon as I manage to work on the scales refactor.

piever commented 4 years ago

I hadn't noticed that before, but bind is already exported by Base, so not really an option, I guess we should go for mapping.

jkrumbiegel commented 4 years ago

Mapping sounds good! Are you done with your PhD? If so, congratulations :)

piever commented 4 years ago

Yes, finally! Thanks :)