Closed kanitw closed 4 years ago
We might not need to extend Vega for this.
Keep in mind that Vega-Lite is a visualization grammar. We can't add pie support without considering its implications for the underlying coordinate system and other visualizations that could be coherently expressed within the grammar.
We have not spent much time on this because pie charts and other polar-coordinate-based charts generally lead to perceptual issues. In fact, visualization experts generally favor 1D stacked bar charts over pie charts. So while they might be useful in some cases, we plan to work on other, more important features first.
I'll thumbs up the desire for this, with my preference being for something like `"mark": "pie"` or `"mark": "donut"`. I really appreciate the ease of c3 with regard to this kind of transformation; it makes it easy to go from stacked bar to pie/donut.
Having to deal with directional data, I need to plot stuff like this http://gisworld.info/polar-plots-and-circular-statistics-in-arcgis/ For that, would polar coordinates support be required, or can it be done already?
If it cannot, I would recommend changing again the title of the issue (and perhaps reconsider the priority?), because I agree that pie charts are terrible, but for directional statistics polar coordinate support is really necessary :)
Hi @mredaelli. Yes, you'd need polar coordinates, and you have to use Vega for now. Unfortunately, we won't add polar coordinates anytime soon; if there is enough need for it, we hope that someone from the community takes on building this feature. We are happy to support anyone who wants to add polar coordinates but don't have the resources ourselves.
Just in case someone wants to help implement this, I think there are two options going forward for this issue:
1) Add support for polar coordinates. The benefit of this approach is that it will allow an easy transition between a stacked bar and a pie chart with just one operation (just adding a polar coordinate), and it will support many other marks. However, it will require quite substantial work in the underlying Vega to make axes and other marks (e.g., line, area) all work in polar coordinates.
2) Add a new `arc` mark like the one Vega currently uses. Although this method won't support all possible plots in polar coordinates, it will require far less work and probably won't require modifying Vega. Plus, I think it will make specs for pie charts and polar plots a bit more readable. Basically, directly having the encoding channel tuple `(radius, angle)` is clearer than requiring users to know that `(x,y)` is basically `(r,t)` / `(radius, angle)` if a polar coordinate is specified.
Thus, IMHO, this might be the way forward.
Looking at Vega's pie chart example, this can be implemented by adding the following things to Vega-Lite:
1) A new `arc` mark.
2) New `angle` and `angle2` encoding channels, only for the arc mark. (These correspond to `startAngle` and `endAngle`, but I kind of feel we should provide a scheme that's consistent with `x` and `x2`.)
a) If a field is mapped to `angle` when there is no field mapped to `angle2`, this basically represents a pie/donut. There are roughly two options:
i) This should automatically populate a pie transform. The encoded field should parameterize the `field` property of the underlying pie transform, and the output fields from the pie transform should be mapped automatically to the `startAngle` and `endAngle` encoding channels in the output Vega spec, just like how `x` and `y` deal with `stack`. For this option, I'm not sure how `sort` should be handled, since the `sort` property in the pie transform is just a boolean. Perhaps we should have some smart logic that uses a collect transform if the `order` encoding channel in VL is specified.
ii) This can instead populate an equivalent "normalize" stack transform and use a linear scale that has a range of [0, 360]. The channel can have a `stack` property for disabling stacking. I personally think this option is better since it still uses a linear scale by default. Plus, I believe that our logic for sorting the stack would still work. That said, I think someone should prototype whether a pie chart can be created this way in Vega first.
b) If both `angle` and `angle2` are provided, we don't need to automatically include a pie/stack transform. But we need to determine the default scale for this case. (It's not very clear yet.)
3) New `radius` and `radius2` encoding channels (or `innerRadius` and `outerRadius`, but I kind of feel we should provide a scheme that's consistent with `x` and `x2`). We might not need this for the first MVP PR. (Just making `radius` and `radius2` mark definition properties can be sufficient for supporting donut and pie.) But this is needed for a Nightingale rose diagram.
4) New `padAngle` and `cornerRadius` mark [properties](https://vega.github.io/vega-lite/docs/mark.html#mark-def) and config.
For each of these channels, we should experiment with whether we can correctly make them work for discrete / continuous scales.
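Rules 2a and 2b above can be sketched as a small compile step. This is only a minimal sketch under assumptions: the `angle` / `angle2` / `order` channel names follow the proposal, `plan_angle_transforms` and the `_start` / `_end` output names are hypothetical, and the emitted dict merely mirrors the shape of a Vega `stack` transform rather than real Vega-Lite compiler output.

```python
def plan_angle_transforms(encoding):
    """Return Vega-style transforms implied by a proposed arc encoding.

    Hypothetical helper: `encoding` is a plain dict shaped like a
    Vega-Lite encoding block; this is not the Vega-Lite compiler.
    """
    angle = encoding.get("angle")
    angle2 = encoding.get("angle2")

    # Rule 2b: with both ends given explicitly, nothing is stacked.
    if angle is None or angle2 is not None:
        return []

    # The channel may disable stacking with an explicit `"stack": None`.
    offset = angle.get("stack", "normalize")
    if offset is None:
        return []

    field = angle["field"]
    stack = {
        "type": "stack",
        "groupby": [],
        "field": field,
        "as": [field + "_start", field + "_end"],
        "offset": offset,
    }

    # Option ii: reuse the stack sort when an `order` channel is present.
    order = encoding.get("order")
    if order is not None:
        stack["sort"] = {
            "field": [order["field"]],
            "order": [order.get("sort", "ascending")],
        }
    return [stack]
```

Calling it with only `angle` set yields one stack transform; adding `angle2`, or setting `"stack": None`, yields an empty list.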
cc: @jheer -- feel free to comment if you have any thoughts :)
Love 'em or hate 'em, pies and doughnuts are high on the list of things my users ask me for. All the time. Have to support them to get wide adoption of your framework. Ditto transitions btw.
Transitions are on the roadmap (https://github.com/vega/vega/issues/641). Pie charts require careful design and we will probably rely on external contributions if somebody needs this.
Right now we're focusing more on statistical graphics and annotations, which are way more (uncontroversially) important for adoption in data science/analysis activities.
I agree that people ask for pie and donut all the time, but that doesn't mean that they should always use them. (Yes, there are cases where they are good, but I don't think this is a high-priority task for us at the moment. We will probably do it at some point, but it's not the main priority right now.)
There is another approach to consider which doesn't involve adding any new grammar and already works in VL 2.3.
Since we now have spherical geographical referencing and map projection onto a plane, we get polar coordinates for free if you represent polar coordinates as longitude/latitude and then project with an `azimuthalEquidistant` projection rotated to one of the poles.
I used this technique to create the following all in Vega-Lite:
I generated the code programmatically with elm-vega which allows me to input a simple list of value-category pairs such as the following:
data =
toPolar
[ ( "robin", 4 )
, ( "sparrow", 6 )
, ( "wren", 10 )
, ( "blue tit", 3 )
, ( "starling", 7 )
, ( "chaffinch", 8 )
]
to generate the necessary longitude/latitude coordinates. So while the resulting Vega-Lite spec below would be a little tedious to write by hand, it should be possible to create a macro to generate the spec as you might for compound statistical graphics.
{
"$schema": "https://vega.github.io/schema/vega-lite/v2.json",
"config": { "view": { "stroke": "" } },
"width": 200, "height": 200,
"projection": {
"type": "azimuthalEquidistant", "rotate": [ 0, 90, 0 ]
},
"data": {
"values": {
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[37.89,-89],[0,-89],[0,90]]]
},
"properties": {"cat": "robin"}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[94.74,-89],[37.89,-89],[0,90]]]
},
"properties": {"cat": "sparrow"}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[189.47,-89],[94.74,-89],[0,90]]]
},
"properties": {"cat": "wren"}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[217.89,-89],[189.47,-89],[0,90]]]
},
"properties": {"cat": "blue tit"}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[284.21,-89],[217.89,-89],[0,90]]]
},
"properties": {"cat": "starling"}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [[[0,90],[360,-89],[284.21,-89],[0,90]]]
},
"properties": {"cat": "chaffinch"}
}
]
},
"format": {"type": "json","property": "features"}
},
"mark": {"type": "geoshape","stroke": "#fff"},
"encoding": {
"color": {
"field": "properties.cat",
"type": "nominal",
"legend": {"title": null}
}
}
}
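For anyone who wants to generate such a spec without elm-vega, the conversion from value-category pairs to wedge polygons can be sketched as follows. This is a guess at what a helper like `toPolar` does, reconstructed from the coordinates in the spec above rather than from the elm source: each wedge runs from the pole `[0, 90]` out to latitude -89 between two cumulative longitudes. The Python name `to_polar` and the `edge_lat` parameter are hypothetical.

```python
def to_polar(pairs, edge_lat=-89):
    """Turn (category, value) pairs into GeoJSON wedge polygons.

    Hypothetical reconstruction: longitudes are cumulative shares of
    360 degrees, rounded to two decimals as in the spec above.
    """
    total = sum(value for _, value in pairs)
    features = []
    start = 0
    running = 0
    for category, value in pairs:
        running += value
        end = round(360 * running / total, 2)
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                # pole -> outer edge at `end` -> outer edge at `start` -> pole
                "coordinates": [[[0, 90], [end, edge_lat], [start, edge_lat], [0, 90]]],
            },
            "properties": {"cat": category},
        })
        start = end
    return {"type": "FeatureCollection", "features": features}


birds = [("robin", 4), ("sparrow", 6), ("wren", 10),
         ("blue tit", 3), ("starling", 7), ("chaffinch", 8)]
collection = to_polar(birds)
```

The resulting `collection` can be dropped into the `"values"` slot of the spec above.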
One consequence of this approach is that it is trivial to present the same data in a stacked Cartesian plot simply by changing the projection to `equirectangular`:
Once you create pie charts, donut charts can be created simply by layering a circle of the background colour:
And the transformation to polar coordinates allows other polar charts to be created such as 'Nightingale' rose diagrams, here shown with a 'graticule' to provide the radial axes.
That's genius and surprised me as a creator of Vega-Lite.
This is amazing :)
I am impressed. Time for some new Altair examples :smile:
> Time for some new Altair examples :smile:
This makes me a bit nervous. While the above definitely works, you won't get nice axes or labels. Let's add a warning about this if you want to add these examples. @kanitw we should decide whether we want the pie chart as a test example.
Fair enough, I can hold off. This just looks so cool.
> you won't get nice axes or labels

You actually get nice grid lines (without labels) with this projection approach, while the normal arc approach wouldn't give you any. (There is no polar axis support in Vega yet.)
That said, I still prefer the arc approach, as it will make the output Vega specs way easier to comprehend, maintain, and reuse.
(Still the examples are amazing!)
This is awesome! But still a hack :) I wouldn't want to explain to users how to generate these geodata datasets from their data. Imagine someone comes to you and says they have tabular data and want to draw it as a pie chart using standard transformations.
Yes (opaque hack), it was never my intention to encourage people to encode their specs directly in this way. Initially I tried it as a bit of a geeky joke, but then became quite taken with the results. I do think it demonstrates some important points though:
If people really need to generate visualisation with polar coords, it is at least technically possible with the current version. In my case I was trying to see if I could reproduce a design I originally wrote with Processing. Personally I am not particularly bothered by the ability to create pie charts, but there are some interesting radial designs that could be more useful.
For environments that use higher level languages/interfaces (altair, elm-vega etc.), it allows the creation of an accessible way of generating radial charts of various kinds. I would put them in the same class as Tukey boxplots etc., using Vega-Lite grammar under the hood to create higher level compound graphics.
Polar coordinates could be treated in Vega-Lite just like map projections of geo coordinates without making significant changes to the grammar. Of course geo coordinates are polar coordinates, just in 3d, which become 2d polar when projected with an azimuthal projection centred at a pole.
I have an interest in the continuum between geospatial and non-geospatial visualization and the spaces that can be created along this continuum so I think it is a useful exercise to push VL to see what can be achieved in those spaces.
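The third point can be made concrete with a few lines of arithmetic. The sketch below assumes an azimuthal equidistant projection centred on the north pole, in degrees and without any scaling; the function name and the choice of which way longitude 0 points are arbitrary. Under that assumption longitude becomes the polar angle and angular distance from the pole (90 minus latitude) becomes the polar radius.

```python
import math


def azimuthal_equidistant_north(lon_deg, lat_deg):
    """Project (longitude, latitude) with the north pole at the centre.

    Illustrative only: radius is the angular distance from the pole in
    degrees, and longitude 0 is drawn pointing towards negative y.
    """
    radius = 90.0 - lat_deg           # distance from the pole
    theta = math.radians(lon_deg)     # longitude is the polar angle
    return radius * math.sin(theta), -radius * math.cos(theta)


# The outer corner of the first wedge in the pie spec earlier in the thread.
x, y = azimuthal_equidistant_north(37.89, -89)
```

Here `math.hypot(x, y)` recovers the radius (179) and `math.atan2(x, -y)` recovers the longitude, which is why a polygon bounded by two meridians projects to a wedge.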
@jwoLondon I agree with you :) It is an interesting approach and has a lot of potential. My concern was mostly with the GeoJSON portion because, in a sense, you can think of it as using Vega-Lite as a fancy SVG drawing lib. Some data provider does all the heavy lifting of converting the data into the visualization's visual form (as opposed to outputting just the pure "raw" data) and outputs it as GeoJSON. VL would then simply draw it, without much thought. In a way, this is similar to using Vega's "image" mark: you get the data semi-prepared. The main drawback is that the Vega spec becomes inseparable from the data generation code on the backend, and that data generation is tightly coupled with the Vega vis. In a typical Vega scenario, the backend doesn't need to be aware of how the data will be used.
> ii) This can instead populate an equivalent "normalize" stack transform and uses a linear scale that has range from [0, 360]. The channel can have a stack property for disable stacking. I personally think this option is better since it still uses a linear scale by default. Plus, I believe that our logic for sorting the stack would still work. That said, I think someone should prototype if a pie chart can be created in this way in Vega first.
Here is an example Vega spec for a pie chart using the `stack` transform and a linear scale. I think this approach would work better for Vega-Lite, as we can reuse the `"angle"` scale both for the case where only `angle` is defined (automatically generating a stack for the pie) and for the case where both `angle` and `angle2` are defined.
{
"$schema": "https://vega.github.io/schema/vega/v3.0.json",
"autosize": "pad",
"padding": 5,
"width": 200,
"height": 21,
"style": "cell",
"data": [
{
"name": "source_0",
"url": "data/barley.json",
"format": {
"type": "json",
"parse": {
"yield": "number"
}
},
"transform": [
{
"type": "aggregate",
"groupby": [
"site"
],
"ops": [
"sum"
],
"fields": [
"yield"
],
"as": [
"sum_yield"
]
},
{
"type": "stack",
"groupby": [],
"field": "sum_yield",
"sort": {
"field": [
"site"
],
"order": [
"descending"
]
},
"as": [
"sum_yield_start",
"sum_yield_end"
],
"offset": "zero"
}
]
}
],
"marks": [
{
"name": "marks",
"type": "arc",
"style": [
"bar"
],
"from": {
"data": "source_0"
},
"encode": {
"update": {
"fill": {
"scale": "color",
"field": "site"
},
"x": {
"signal": "width/2"
},
"y": {
"signal": "height/2"
},
"innerRadius": {"signal": "0"},
"outerRadius": {"signal": "width / 2"},
"startAngle": {
"scale": "angle",
"field": "sum_yield_end"
},
"endAngle": {
"scale": "angle",
"field": "sum_yield_start"
}
}
}
}
],
"scales": [
{
"name": "angle",
"type": "linear",
"domain": {
"data": "source_0",
"fields": [
"sum_yield_start",
"sum_yield_end"
]
},
"range": [
0,
{"signal": "2*PI"}
],
"nice": false,
"zero": true
},
{
"name": "color",
"type": "ordinal",
"domain": {
"data": "source_0",
"field": "site",
"sort": true
},
"range": "category"
}
],
"legends": [
{
"fill": "color",
"title": "site",
"encode": {
"symbols": {
"update": {
"shape": {
"value": "square"
}
}
}
}
}
],
"config": {
"axisY": {
"minExtent": 30
}
}
}
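To make the data flow of this spec easier to follow, here is a rough Python rendering of its three steps: aggregate the sum per category, stack the sums in descending category order, and map the stacked values through a linear scale from `[0, total]` onto `[0, 2π]`. It is a sketch of the idea, not of Vega's implementation; `pie_angles` and the tiny input table are made up for illustration.

```python
import math
from collections import defaultdict


def pie_angles(rows, category, value):
    """Compute start/end angles (radians) for one arc per category."""
    # aggregate: sum `value` grouped by `category`
    totals = defaultdict(float)
    for row in rows:
        totals[row[category]] += row[value]

    # stack (sorted by category, descending) + linear scale to [0, 2*pi]
    grand_total = sum(totals.values())
    arcs = []
    running = 0.0
    for key in sorted(totals, reverse=True):
        start = running
        running += totals[key]
        arcs.append({
            category: key,
            "startAngle": 2 * math.pi * start / grand_total,
            "endAngle": 2 * math.pi * running / grand_total,
        })
    return arcs


# Made-up rows standing in for data/barley.json.
rows = [
    {"site": "A", "yield": 1.0},
    {"site": "B", "yield": 2.0},
    {"site": "A", "yield": 1.0},
]
arcs = pie_angles(rows, "site", "yield")
```

With these rows, site B takes the first half of the circle and site A the second half.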
General support for polar visualizations would be useful, something without too much typing required :) If you did add support for pie charts, my vote would be to not bother unless you also support polar area charts, which are much more useful in practice.
> polar area charts

Arc marks can be used to compose radial plots / polar area charts without the need for a new coordinate system.
However, this won't work for radar charts, etc.
(Copied from slack as @iliatimofeev asks for this feature)
For selection, I actually think that the single and multi selection would come for free (as there is nothing fundamentally different about them). The interval selection is the only bit that's tricky. However, just like applying interval selection over a stacked bar isn't meaningful, applying interval selection over polar coordinates isn't meaningful either. Thus, it shouldn't be supported.
The question is whether the rest of the team are okay with supporting arc mark at all. If they are okay with that, the implementation should be quite straightforward (although it will take some amount of effort).
I'm personally OK with having this, but I also think that it's quite low priority given the limited resources we have. So even if the team agrees to have this, I don't plan to do this soon.
> So while the resulting Vega-Lite spec below would be a little tedious to write by hand, it should be possible to create a macro to generate the spec as you might for compound statistical graphics.
Would this be implementable via a dedicated vega-lite transform?
> The question is whether the rest of the team are okay with supporting arc mark at all.
As jwoLondon appears to already have "tested" the polar coordinate approach, my guess is that implementing arc can have its own advantages.
As suggested above, this can be done by adding arc mark with angle channel without an additional transform.
Using projection to generate pie chart is an awesome hack but it shouldn't be the official solution if we were to support this at all.
@jwoLondon I would love to see the spec for your "Nightingale" chart. I think it would benefit others.
I love these comments and the cleverness you've shown in creating pies. And I do need pies for my own work (I'm not a mindless hater of pies). I would hope, however, that you implement polar charts as a coordinate system like the one in the book. The examples by @jwoLondon show how using coordinates can change the shape of a divided bar into a pie. But using shortcuts like an "arc" mark substantially deviate from the grammar and lead to some snarky places. What I love about vega-lite is that it is a serious attempt to implement the grammar and avoids the kind of shortcuts that every other library seems to take in order to generate the charts people love. Yes, polar coordinates do require some serious work for axes because all the ticks and labels are different (I've felt that pain when programming them for the book). Maybe for now, though, you could make polar coordinates work for pie and leave axes on your todo list.
@lelandwilkinson -- Thanks for your comment. It's very encouraging to hear praise for Vega-Lite from you. :)
Reading your comment, I'm curious why you think "arc" could lead to snarky places (what are example snarky places you're thinking about).
I actually think arc marks with explicit "radius" and "angle" channels (separated from "x" / "y") have some benefits.
Given that Vega and Vega-Lite have explicit encoding channels in the languages, I think Vega-Lite should not use `(x,y)` as channels to describe coordinates for polar coordinates.
Though arcs/wedges are "equivalent" to converting rects with `(x, y)` positions into polar coordinate's `(angle, radius)` positions, people rarely think of angle as x and radius as y. Even mathematicians would use (r,t) or (ρ,θ) for polar coordinates, not (x,y). Thus, naming the channels as (angle, radius) would match most people's mental model better.
Note that the original GoG has no such problem since the `position(a*b)` method can infer that a is either x or angle, and b is either y or radius, based on the coordinate. This flexibility is a very elegant aspect of the original GoG, IMHO.
But given we have chosen a more explicit path, I think it's important that we name them to match the mental model of users.
It's also debatable whether using "bars" to represent arcs, as in ggplot2, is a good idea. While arcs/wedges are "warped" bars, they are not literally "bars", so calling them `bar` as in ggplot would introduce a mental mismatch for users and make code in the language hard to read.
(The term "interval" in the original GoG is less problematic and flexible enough for either coordinate system in this regard. But given we already have "bar", renaming it at this point is too late.)
In practice, people may want to mix polar and Cartesian coordinates.
Thus, distinguishing between warped marks (e.g., arcs) and marks that won't get warped (bars, texts) as well as having some flexibility to combine them could be useful.
Note that I know that some of these graphics can be problematic, but there are some cases that they are preferable too.
Interesting discussion. Thanks for reopening. I hope the comments below aren't off-topic, but I think they represent one approach to mixing coordinate systems.
I do wonder if most of the issues you raise under 'comparability between polar and Cartesian coordinates' could be addressed with data-driven view composition. This is something I've wanted to see in a visualization specification for a long time (we considered this with HIVE a few years ago, but the language was rather too abstract to be of practical benefit in most situations). In Vega-Lite, data-driven view composition would offer significant creative possibilities and covers most of your examples above.
So in addition to composing views as hierarchies on a grid (`hconcat`, `vconcat`, etc.), you allow a sub-view to be positioned by some characteristic of a parent's position. The most obvious might be its centroid, which can therefore be independent of the coordinate system. This would be particularly useful for mapping, like your pie map example, where some arbitrary plot is specified for each spatial unit. It would also solve your pie labelling problem without having to worry about the specification explicitly mixing polar and Cartesian coordinate systems.
> @lelandwilkinson -- Thanks for your comment. It's very encouraging to hear a praise about Vega-Lite from you. :)
> Reading your comment, I'm curious why you think "arc" could lead to snarky places (what are example snarky places you're thinking about).
> I actually think arc marks with explicit "radius" and "angle" channels (separated from "x" / "y") have some benefits.
The problem I see with that is that eventually you'd have to do a separate set of parameters for each coordinate system. How about spherical coordinates, for example? Would you introduce rho and theta with a special spherical triangle or hexagon tile mark? The way I viewed it in GG was to make x and y and z to be "roles" (what vega-lite is doing with channels) that were assigned to parameters in coordinate systems. Then marks applied to every coordinate system (squares on the surface of the globe, lines as great circle routes, etc.) It made things easy for adding new coordinate systems, as we did in nViZn. Seems much simpler to me. I always tried to think of the most general case, even if it didn't exist yet in code. Yes, there are agile design principles that go against trying to be general when you start, but that puts you in a constant refactoring situation, it seems to me.
Just so you know, my favorite chapter in GG is Coordinates. I tried to show the statistical community that many of the things they did for data transformations had direct analogues in graphics coordinate systems. Figure 9.15 showed that the weighted least squares estimator could be implemented in a coordinate transformation that showed what was being done to the x,y channels (to use the vega-lite term). Or Figure 9.17, which showed that the most appropriate transformation for some weather data was onto the complex plane. Now, I know these are rather esoteric examples, but when coordinate systems are treated as general functions that map (x,y,z) channels to real or complex numbers, it opens up an enormous range of possibilities. For the first time, one has a graphics system that can draw meaningful graphics that nobody has seen before. That was one of the motivations behind writing GG in the first place. If vega-lite did it that way, outside contributors could write coordinate functions that did all sorts of new graphics without changing a line of code elsewhere. And as one of you pointed out, changing from divided bar to pie would involve only one JSON tag. I hope you don't mind my complicating your life like this, but having gone through a lot of these GG-inspired systems, I believe yours is the most authentic implementation. I'm using it every day. Thanks for all the great work you've done.
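The idea of coordinate systems as interchangeable functions can be illustrated with a toy example. Everything below is hypothetical and far simpler than a real grammar implementation: a mark is described once as an interval in (x, y) "role" space, and the coordinate function alone decides whether it is drawn as a bar segment or a wedge. Here x is taken as a fraction of a full turn and y as a radius when the polar function is used.

```python
import math


def cartesian(x, y):
    """Identity coordinate system."""
    return (x, y)


def polar(x, y):
    """Treat x as a fraction of a full turn and y as a radius."""
    theta = 2 * math.pi * x
    return (y * math.sin(theta), y * math.cos(theta))


def segment_outline(x0, x1, y0, y1, coord, steps=16):
    """Outline of the interval [x0, x1] x [y0, y1] mapped through `coord`."""
    width = x1 - x0
    outer = [coord(x0 + width * i / steps, y1) for i in range(steps + 1)]
    inner = [coord(x1 - width * i / steps, y0) for i in range(steps + 1)]
    return outer + inner


# The same quarter-share segment, drawn as a divided bar and as a pie wedge.
bar = segment_outline(0.0, 0.25, 0.0, 1.0, cartesian)
wedge = segment_outline(0.0, 0.25, 0.0, 1.0, polar)
```

Only the `coord` argument differs between the two calls, which is the "one tag" switch from divided bar to pie described above.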
@jwoLondon @lelandwilkinson Thank you so much for your comments.
We're having a very fruitful discussion here and I'm learning a lot from this. :)
> So in addition to composing views as hierarchies on a grid (hconcat, vconcat etc.) you allow a sub-view to be positioned by some characteristic of a parent's position. The most obvious might be its centroid which can therefore be independent of coordinate system.
Yep, this is another possible way to consider it. Vega group marks support this well, though we have to think about whether this fits well at the Vega-Lite abstraction level.
> The problem I see with that is that eventually you'd have to do a separate set of parameters for each coordinate system. How about spherical coordinates, for example? Would you introduce rho and theta with a special spherical triangle or hexagon tile mark? The way I viewed it in GG was to make x and y and z to be "roles" (what vega-lite is doing with channels) that were assigned to parameters in coordinate systems. Then marks applied to every coordinate system (squares on the surface of the globe, lines as great circle routes, etc.) It made things easy for adding new coordinate systems, as we did in nViZn.
This is a very good point and it is the part that I really appreciate about the original GG. I think once we implement other coordinate systems (other than geo projections that we support right now), we should definitely keep this in mind.
That said, it's also worth noting that arc marks already exist in Vega, so surfacing them will make things consistent across our abstraction levels.
Plus surfacing arc from Vega may take only several hours to implement whereas general coordinate support, while nice to have, will take at least a few full-time weeks to do.
As we're mostly working on Vega(-Lite) in our free time (in addition to our main job), we might not have time for the full coordinate support anytime soon.
I'm starting a PR for this: https://github.com/vega/vega-lite/pull/5676. Let's continue our discussion in the PR.
@jwoLondon Can you share the wind direction data you used in https://github.com/vega/vega-lite/issues/408#issuecomment-373870307?
The (fictitious) data were generated programmatically in elm from the following:
windData =
toRose
[ ( "N", 4 )
, ( "NE", 6 )
, ( "E", 5 )
, ( "SE", 3 )
, ( "S", 7 )
, ( "SW", 8 )
, ( "W", 7 )
, ( "NW", 6 )
]
which created the following geoJSON:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[45, -47.43],
[0, -47.43],
[0, 90]
]
]
},
"properties": { "cat": "NW" }
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[90, -68.66],
[45, -68.66],
[0, 90]
]
]
},
"properties": { "cat": "W" }
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[135, -89.9],
[90, -89.9],
[0, 90]
]
]
},
"properties": { "cat": "SW" }
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[180, -68.66],
[135, -68.66],
[0, 90]
]
]
},
"properties": {
"cat": "S"
}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[225, 16.29],
[180, 16.29],
[0, 90]
]
]
},
"properties": {
"cat": "SE"
}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[270, -26.19],
[225, -26.19],
[0, 90]
]
]
},
"properties": {
"cat": "E"
}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[315, -47.43],
[270, -47.43],
[0, 90]
]
]
},
"properties": {
"cat": "NE"
}
},
{
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
[0, 90],
[360, -4.95],
[315, -4.95],
[0, 90]
]
]
},
"properties": {
"cat": "N"
}
}
]
}
Although presumably you don't need the geoJSON route following the release of VL 4.9.
@jwoLondon Is it possible to share the JSON spec you used to build the chart "Wind Direction 2015 - 2017"? It would be useful for https://github.com/vega/vega-lite/issues/9079. Thanks a lot.
Is full polar coordinate support required for pie charts? It seems like we could just generate a vega pie transform with the startAngle/endAngle mapped to the color channel.