microsoft / datamations

https://microsoft.github.io/datamations/
Other
67 stars 14 forks source link

Offset infogrid / jitter using X, without color #61

Closed sharlagelfand closed 3 years ago

sharlagelfand commented 3 years ago

Seems like infogrid / jitter case can only handle when the offsetting variable is color:

"small_salary %>%
    group_by(Degree, Work) %>%
    summarize(mean_salary = mean(Salary, na.rm = TRUE)) %>%
    ggplot(aes(x = Degree, y = mean_salary)) +
    geom_point() +
    facet_grid(~ Work)" %>%
  datamation_sanddance() 

Specs here.

https://user-images.githubusercontent.com/15895337/119526361-528c7400-bd4d-11eb-85df-d26b0acad222.mov

The above R code says nothing about color, and really should only be offsetting by X, but this doesn't work (I manually added color in to the specs above)

https://user-images.githubusercontent.com/15895337/119526824-badb5580-bd4d-11eb-82f3-93086eeb5d1e.mov

Specs here. And here is the jitter frame to show how it should look (the summary frame is correct, but jitter is not, neither is infogrid).

cc @giorgi-ghviniashvili

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand yes, that's right, it only shifts on color. Could you please include jitterField in the meta? For example, for color, meta will be: { parse: "jitter", jitterField: "sex" }, for degree split (current frame), it will be { parse: "jitter", jitterField: "x" }. If jitterField is not available or is null, it will not split points and will treat like a single group.

Here is example if jitterField is x:

image

giorgi-ghviniashvili commented 3 years ago

Or maybe let's call it splitField or offsetField?

giorgi-ghviniashvili commented 3 years ago

Ok, using splitField for both parse:jitter and parse:grid.

sharlagelfand commented 3 years ago

Thanks @giorgi-ghviniashvili! This works well for this first example:

https://user-images.githubusercontent.com/15895337/119843642-bcd02080-bed5-11eb-801a-e2de4a097681.mov

Though I realized that now, without the colour, in the third frame (~5 seconds in) we have nothing to distinguish masters from phd, might be something for us to keep in mind re: adding an axis in to distinguish (cc @jhofman).

I'm looking back on the other example from #40 here, and I think the second row of the grid needs to be offset so that it matches up with the jitter in the next frame:

"small_salary %>%
  group_by(Degree, Work) %>%
  summarize(mean_salary = mean(Salary, na.rm = TRUE)) %>%
  ggplot(aes(x = Degree, y = mean_salary)) +
  geom_point() +
  facet_grid(Degree ~ Work)" %>%
  datamation_sanddance()

https://user-images.githubusercontent.com/15895337/119844516-7e873100-bed6-11eb-857a-74951f5357c9.mov

Here are the specs for that.

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand about the third frame, yes we will need x-axis and then instead of this:

image

We can use this (which is already offset and matches next jitter frame):

image

sharlagelfand commented 3 years ago

@giorgi-ghviniashvili the goal is to still have both column and row facets, offset by X, not to translate into column facets and x values only. We want a user to be able to specify the final form so if they specify col + row facets, with X offset, we should render that

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand so do you mean in this frame grid need to be offseted? Can you clarify how they will look? Can we first add x-axis?

image

sharlagelfand commented 3 years ago

Yes, exactly @giorgi-ghviniashvili

The grid looks like this

Screen Shot 2021-05-28 at 10 41 47 AM

and the next frame looks like this

Screen Shot 2021-05-28 at 10 42 00 AM

which is confusing, because the transition looks like this

Screen Shot 2021-05-28 at 10 42 10 AM

so the first frame needs to be offset as well so that the line up in the transition and are in the correct place.

re: adding axes, that was my question in this comment about whether it might make sense to have (just x) axes in the grid as well.

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand that's already achievable.

In third frame here, please remove splitField (because you are not actually splitting within groups). SplitField should only be used when you want to further split a single grid specified in data.values with n.

And to shift it, please modify data.values like this:

image

(70 means shift by 70 grid points, which will be 7 columns). (90 = 90 points, 9 columns).

"So a single column consists of 10 points.."

It will render this: image

Let me know if that works.

giorgi-ghviniashvili commented 3 years ago

https://user-images.githubusercontent.com/6615532/120077510-c74c0f00-c0bb-11eb-9b5b-27e1e9d736e3.mov

sharlagelfand commented 3 years ago

@giorgi-ghviniashvili I'm hesitant to introduce any shiftX logic into the R side, since it's all being calculated on the JS side right now - we would be duplicating all the logic that already exists for calculating this into R and if there are any changes (e.g. if we want to change how many points are in a column) it would have to be done in two places. What do you think about handling adding that in on your end?

There is another case that I need to think about for this, e.g. if Work is in column facets, Degree is in row facets, and Work is on the X axis, how it should look. I'll put together some specs / a sketch for this

sharlagelfand commented 3 years ago

Here is a sample spec for the example I described above, where Work is in column facets, Degree is in row facets, and Work is on the x-axis, created with this code:

small_salary %>%
  group_by(Degree, Work) %>%
  summarize(mean_salary = mean(Salary, na.rm = TRUE)) %>%
  ggplot(aes(x = Work, y = mean_salary)) +
  geom_point() +
  facet_grid(Degree ~ Work)

Here are what the frames look like:

Screen Shot 2021-05-31 at 11 06 43 AM

Screen Shot 2021-05-31 at 11 06 48 AM'

With the animation between them looking like this:

Screen Shot 2021-05-31 at 11 06 59 AM

Ideally, to be consistent with the final plot, the infogrid for the right column should have those points "offset" by the width of the points in the first column

sharlagelfand commented 3 years ago

Not sure if this is related to this (I'm losing track a bit of which bugs fit where!) but this spec has its X offset for grid working great:

Screen Shot 2021-05-31 at 1 54 36 PM

but the axes on the jitter is off

Screen Shot 2021-05-31 at 1 50 46 PM

The spec in its own looks fine, and in the next frames, just something happening with the axes I think (meta.axes is set to FALSE)

Screen Shot 2021-05-31 at 1 51 17 PM
giorgi-ghviniashvili commented 3 years ago

@sharlagelfand please update link of "this spec" , it does not link anything.

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand agree that it should be done in JS side, let me think a bit how we can solve this, most likely we will need a new field in meta to find out when should js do it.

sharlagelfand commented 3 years ago

Oops sorry @giorgi-ghviniashvili! Spec updated in that comment now

giorgi-ghviniashvili commented 3 years ago

Ok, that works because we don't have facets and grid, we only have one group of data which is split by spitField and jittered. Works great.

Because jitter, domain is adjusted while axis stays same, so I will need ['Academia', 'Industry'] to be added to somewhere to be able to correctly position them.

image

For Academia, I will set it to 75 and for Industry I will set it to 225. NO HARDCODE.

P.S. I could get it using splitField, but can't make it correctly sorted.

giorgi-ghviniashvili commented 3 years ago

image

giorgi-ghviniashvili commented 3 years ago

We can add it as meta.xAxisLabels, done it here.

giorgi-ghviniashvili commented 3 years ago

There is another case that I need to think about for this, e.g. if Work is in column facets, Degree is in row facets, and Work is on the X axis, how it should look. I'll put together some specs / a sketch for this

I think having column facets and x axis is same fields is very confusing, because In Academia, you have Academia and Industry, same for Industry as facet. In general, I think Column facet field should not match x axis field.

giorgi-ghviniashvili commented 3 years ago

And about this one:

image

When you add meta.xAxisLabels, I will try to add shiftX myself matching xAxisLabels array.

For this to know when to do please add meta.shiftGrids: true (naming suggestions welcome).

sharlagelfand commented 3 years ago

We can add it as meta.xAxisLabels, done it here.

This info is already contained in the labelExpr - is the reason it can't be pulled from there because of the ordering?

I think having column facets and x axis is same fields is very confusing, because In Academia, you have Academia and Industry, same for Industry as facet. In general, I think Column facet field should not match x axis field.

I agree, I think it looks pretty bad and confusing :) I don't necessarily want to encourage it, but if we want to take ggplot2 code as input to specify how the plot looks (#40) , then that's a case that should be handled

giorgi-ghviniashvili commented 3 years ago

This info is already contained in the labelExpr - is the reason it can't be pulled from there because of the ordering?

No, I don't want parse the expression.. What if we need another expression or expression is not correct at all? Better to put it in meta and also use it for shift to match facet with x-axis.

And also ordering issue if we parse it.

sharlagelfand commented 3 years ago

Fair! It's probably easier to pull it out in the correct order in R and just pass it along.

To confirm... meta.xAxisLabels should be set in the jitter spec when parse: jitter, splitField: true, axes: false (if axes: true it doesn't need to be set, because these are hacked facets which already handles all of the x axis labelling)

and then also set in cases where the infogrid needs to be split, so parse: grid, splitField: x, and shiftGrids: true also include meta.xAxisLabels (both cases when x is the same as column facet or when x is the same as row facet)?

Once we have this all sorted... I will make a table of all the meta settings and when they are flagged / how to use them :)

giorgi-ghviniashvili commented 3 years ago

meta.xAxisLabels should be set in the jitter spec when parse: jitter, splitField: true, axes: false (if axes: true it doesn't need to be set, because these are hacked facets which already handles all of the x axis labelling)

Yes, correct!

and then also set in cases where the infogrid needs to be split, so parse: grid, splitField: x, and shiftGrids: true also include meta.xAxisLabels (both cases when x is the same as column facet or when x is the same as row facet)?

In this case, please remove splitField, because in case of info-grid, we are not actually splitting anything.

Once we have this all sorted... I will make a table of all the meta settings and when they are flagged / how to use them :)

Yes, we need it for sure

sharlagelfand commented 3 years ago

Ok awesome - I will update it hopefully later today or tomorrow!

For the splitField in info grid - do we still need it in the cases when it is split? If the grid is split by a variable in color for example

giorgi-ghviniashvili commented 3 years ago

For the splitField in info grid - do we still need it in the cases when it is split? If the grid is split by a variable in color for example

Yes, we will need it. So a single object in data.values is a single grid which will be generated. If we want to further split it, we need splitField.

If we don't need to split, but shift, we should not put splitField, but shiftGrids to distinguish them from each other.

sharlagelfand commented 3 years ago

Thanks @giorgi-ghviniashvili, adding xAxisLabels works great for this example:

https://user-images.githubusercontent.com/15895337/120347286-24b5ab00-c2ca-11eb-84f3-46af377dfa60.mov

I'm looking at adding shiftGrids in, it hasn't been added in on your end right? Here is a spec that contains it that we can test with.

giorgi-ghviniashvili commented 3 years ago

@sharlagelfand can we please start with a situation when column facet and x-axis are different? In this spec, they both are same and it is ambiguous for me

jhofman commented 3 years ago

We realized we were in the weeds on edge cases here, so let's settle some conventions in #68 first before continuing on this.

@giorgi-ghviniashvili: side notes are that we'd like some more labeling of infogrid groups when there's a non-obvious split (by perhaps turning on x axis ticks), and maybe we should always center-justify infogrids to match the center-justification of jitter plots in following frames, so that center justify is the new default?

giorgi-ghviniashvili commented 3 years ago

@jhofman I think that this one should be closed, since we continue working in #68.

jhofman commented 3 years ago

sounds good!