microsoft / datamations

https://microsoft.github.io/datamations/

Add custom aggregation animations #18

Closed jhofman closed 2 years ago

jhofman commented 3 years ago

Right now mean shows points collapsing. Here are suggestions for how other aggregation operations can be animated: https://idl.cs.washington.edu/files/2019-AnimatedAggregates-EuroVis.pdf

giorgi-ghviniashvili commented 3 years ago

@jhofman I have read the paper.

I just forked this repo and created a demo of the designs folder.

These example transitions are great. Transitions like these make much more sense than a plain linear interpolation from point A to B.

I don't think gemini or any other library will support this kind of custom transition. We would need to code them ourselves, as the authors did.

giorgi-ghviniashvili commented 3 years ago

Take a look at this one as well: demo

jhofman commented 3 years ago

@giorgi-ghviniashvili: these are great. you commented that gemini won't support these. do you think we can do them with d3 on top of vegalite plots, or do the plots themselves have to be entirely done in d3?

@dggoldst, take a look at these two demos for different ways to visualize different types of aggregation functions used to summarize data:

https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/apps
https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/

giorgi-ghviniashvili commented 3 years ago

These demos are done with d3 because it is more flexible than vega (low-level access to transitions, scales, and drawing).

But after playing with vega a bit, I think we can do the same with vega, though we will need multiple vega specs: one for each simple step.

For example, the arithmetic mean has multiple steps, and each step introduces new elements or instructions: an area or lines, circle transforms. Each of these will need a new vega spec. With gemini sync we can only specify the animation sequence of the elements that need to be transformed.

To sum up, I am not sure how smooth the animation will be with gemini and vega or how easy it will be to generate all the specs, but I think it is possible.

giorgi-ghviniashvili commented 3 years ago

After playing with Sharla's specs and drawing the grid, I think it is easy to customize each frame as we want: hide axes, change the domain, scale, axis types, and so on.

jhofman commented 3 years ago

> But after playing with vega a bit, I think we can do the same with vega, though we will need multiple vega specs: one for each simple step.
>
> For example, the arithmetic mean has multiple steps, and each step introduces new elements or instructions: an area or lines, circle transforms. Each of these will need a new vega spec. With gemini sync we can only specify the animation sequence of the elements that need to be transformed.
>
> To sum up, I am not sure how smooth the animation will be with gemini and vega or how easy it will be to generate all the specs, but I think it is possible.

Got it. So seems like going with gemini and vega wouldn't make it impossible to implement these transitions down the line if we wanted to later, but it might be a lot of work.

Do you think it's worth trying this approach for just one aggregation operation like count or median to see how it goes?

giorgi-ghviniashvili commented 3 years ago

> Got it. So seems like going with gemini and vega wouldn't make it impossible to implement these transitions down the line if we wanted to later, but it might be a lot of work.

Agreed. It is not impossible, but it is a lot of work.

> Do you think it's worth trying this approach for just one aggregation operation like count or median to see how it goes?

Yes, I think it is worth trying.

jhofman commented 3 years ago

Snoozing this, but when we do get to it, we'll try Gemini2 as per #69.

jhofman commented 2 years ago

@sharlagelfand will explore doing a custom animation for mean, which requires parsing the summarize function a bit more closely, then @giorgi-ghviniashvili can prototype it.

sharlagelfand commented 2 years ago

I've updated the summary function parsing so that if the summary function is mean or median, the first spec produced by prep_spec_summarize will contain meta.custom_animation = "mean" (or median).

fyi @chisingh this is something that should be added on the python side as well, to ensure consistency between the specs

jhofman commented 2 years ago

@giorgi-ghviniashvili will implement custom versions of the first mean and median animations listed here: https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/

giorgi-ghviniashvili commented 2 years ago

@jhofman median animation is there 🔥

https://user-images.githubusercontent.com/6615532/149820520-83319a4b-7e52-4b3a-8658-5a7fbdb5f382.mov

P.S. There is a difference between count and median: median needs some initial y values to sort by, while count does not and can be calculated from the grid spec. I added an intermediate frame with randomized data after the grid.

giorgi-ghviniashvili commented 2 years ago

@jhofman mean animation is there 🔥

https://user-images.githubusercontent.com/6615532/149934277-3f57e451-a348-4b47-8abc-f01d69d4802d.mov

jhofman commented 2 years ago

the mean animation looks great!

two small tweaks:

  1. plot the actual values on the y axis right after the grid
  2. make a smooth transition from the vertically stacked y values to the diagonal sorted y values

median is also really cool.

tweaks:

  1. should read "plot median" instead of "plot count"
  2. once the median is calculated, have the original points disappear first, then zoom the axis range (to mirror what we used to have with non-custom animation)

jhofman commented 2 years ago

after that, min and max would be the next obvious custom animations to implement.

giorgi-ghviniashvili commented 2 years ago

updated median animation:

https://user-images.githubusercontent.com/6615532/150313094-e17f0faf-0ef2-4fea-b293-2b1985df7a4a.mov

updated mean animation:

https://user-images.githubusercontent.com/6615532/150316905-eb0fa5d6-5ab1-419f-adc3-e358b6346640.mov

giorgi-ghviniashvili commented 2 years ago

Min:

https://user-images.githubusercontent.com/6615532/150326462-0570f8cb-6abb-4646-ba87-2cce74772927.mov

Max:

https://user-images.githubusercontent.com/6615532/150326497-f8ab0163-1883-4096-8d52-2d6f48980c00.mov

jhofman commented 2 years ago

great!

a few tweaks we discussed:

also, let's check that jitter is working fine when these specs are generated from R code and see how it looks. @giorgi-ghviniashvili, can you create a video of it with jitter so we can see if it looks weird or not?

also, we talked about count always doing an info grid (even if previous frame shows continuous values), which seems fine for now but we can revisit if needed.

side note, steps for debugging R to get specs are:

library(datamations)
library(dplyr)
debug(datamations::datamation_sanddance)
"small_salary %>% group_by(Degree) %>% summarise(mean = mean(Salary))" %>% datamation_sanddance()
# step through code until second to last line of function
clipr::write_clip(res)

(there's probably a better way, but this works at least.)

giorgi-ghviniashvili commented 2 years ago

Min and max now fade out.

https://user-images.githubusercontent.com/6615532/151397802-e38b5d84-9bc8-4690-8196-faea4453834c.mov

jhofman commented 2 years ago

Nice on the fade out.

It looks like some of the points move before fading out, such as on the lower left between the 2 and 3 second mark. Any idea what's up there?

giorgi-ghviniashvili commented 2 years ago

@jhofman I noticed it and fixed it. gemini_id was missing in change.data; Gemini recommend was not setting it by itself.

image

Min: https://user-images.githubusercontent.com/6615532/151529100-4cf3493c-0578-4e40-a232-2336feddbbdf.mov

Max: https://user-images.githubusercontent.com/6615532/151529219-2633f898-1a8b-41da-ad29-30d510e34942.mov

P.S. the code for all custom animations is in this branch

giorgi-ghviniashvili commented 2 years ago

Jitter works. Couple of comments though.

Here is median and quantile with jitter: https://user-images.githubusercontent.com/6615532/151532891-63aec884-1f6e-4969-8260-8dc8ff527bfa.mov

Mean + jitter: https://user-images.githubusercontent.com/6615532/151534558-e524cc4a-76dd-43e9-8f66-a9d062de664f.mov

Min + jitter (max is the same, but with the lines at the top): https://user-images.githubusercontent.com/6615532/151534967-639898ff-353e-457c-bb60-c3b87f84192f.mov

jhofman commented 2 years ago

@willdebras next step on this is related to #137, which is to parse and pass more custom functions in the vegalite spec.

For instance, right now we definitely have meta.custom_animation = "count" and possibly have meta.custom_animation = "mean" being added to vegalite specs for different steps, but most likely we don't have min, max, median, or quantile.

It would be nice to have a generic function parsing mechanism of the following type:

df %>%
  group_by(x) %>%
  summarize(z = f(y, a, b, ...))

where you could pull out that the function being called is f, the variable being summarized is y, and the extra parameters being passed are a, b, etc. (this comes up in something like quantile(y, 0.1))
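
A minimal sketch of the kind of generic parsing described above, using only base R. The helper name parse_summary_call and the shape of its return value are hypothetical and not part of datamations; it also assumes the first argument of the summary function is the variable being summarized.

```r
# Hypothetical helper (not part of datamations): split a summary call such as
# quantile(y, 0.1) into the function name, the summarized variable, and any
# extra arguments.
parse_summary_call <- function(call) {
  stopifnot(is.call(call), length(call) >= 2)
  parts <- as.list(call)
  list(
    fn = deparse(parts[[1]]),        # function being called, e.g. "quantile"
    variable = deparse(parts[[2]]),  # assumed: first argument is the summarized column
    args = parts[-c(1, 2)]           # remaining arguments, named where names were given
  )
}

# Pull the summary expression out of summarize(z = f(y, a, b, ...)).
summarize_call <- quote(summarize(z = quantile(y, 0.1, na.rm = TRUE)))
parsed <- parse_summary_call(summarize_call[["z"]])
```

Here parsed$fn is the function name, parsed$variable is the summarized column, and parsed$args holds the extra parameters; positional ones stay unnamed.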

My guess is that this could go in prep_specs_summarize.R or possibly parse_functions.R, but let's see.

Also reminder that we want to reflect the summary operation in the title.

jhofman commented 2 years ago

p.s. @willdebras, see this comment for one thought on how to debug things, happy to know if there's a better way (i imagine there is):

https://github.com/microsoft/datamations/issues/18#issuecomment-1017611841

jhofman commented 2 years ago

@giorgi-ghviniashvili, can you test the custom animations with facets to make sure everything works?

willdebras commented 2 years ago

> p.s. @willdebras, see this comment for one thought on how to debug things, happy to know if there's a better way (i imagine there is):
>
> #18 (comment)

Right now in prep_specs_summarize.R the meta.custom_animation is passed straight from the mappings and description:

  if (mapping$summary_function %in% c("mean", "median")) {
    spec[["meta"]][["custom_animation"]] <- mapping$summary_function
  }

It will be pretty straightforward to update the meta specs here to just include more summary functions from the summary_function mapping.
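
As a rough sketch of that update, the check could be driven by a vector of supported functions. The function name add_custom_animation and the exact set of supported functions are assumptions for illustration; only the spec[["meta"]][["custom_animation"]] field and mapping$summary_function come from the snippet above.

```r
# Assumed set of summary functions that have a custom animation; adjust as needed.
custom_animation_functions <- c("mean", "median", "min", "max")

# Hypothetical helper: tag a spec with the custom animation to use, if any.
add_custom_animation <- function(spec, mapping) {
  # isTRUE() guards against a missing (NULL) summary_function
  if (isTRUE(mapping$summary_function %in% custom_animation_functions)) {
    spec[["meta"]][["custom_animation"]] <- mapping$summary_function
  }
  spec
}

tagged <- add_custom_animation(list(meta = list()), list(summary_function = "min"))
untagged <- add_custom_animation(list(meta = list()), list(summary_function = "sd"))
```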

For the generic function parsing, I have a good sense of how to implement it here. The fittings object toward the beginning of datamation_sanddance() actually already parses these pretty well, e.g. the trim parameter call to median, so it's not a huge lift to pass these into the specs.

image

So to implement this, I am curious where we want these to end up in the specs that get passed to vegalite. Should these additional parameters be passed to the mapping directly or end up in the meta specs? I see that for the quantile issue we are expecting just a string like "quantile(0.10)", but if we want this generic parsing, should it end up as a list with named values?

This would change the approach here a bit, i.e. whether I change the args passed to parse_functions.R or generate_mapping.R, or just add new definitions to the meta list.

willdebras commented 2 years ago

The fittings object, though, does not return the name of the arg if the name isn't explicitly provided, e.g. mean(x, 0.2) vs. mean(x, trim = 0.2). I think we can parse the result of calling base::args() or base::formals() on the summary function, e.g. formals(mean.default) (which returns $x, $trim, etc.), to fill these in and provide them to the vegalite specs.
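
One possible way to fill in the missing names, sketched with base R's match.call(). The helper name name_call_args is hypothetical. Note that for an S3 generic such as mean, the named arguments (like trim) are defined on the method, so the sketch assumes the default method is the one being called.

```r
# Hypothetical helper: name positional arguments by matching the call against
# the function (or S3 method) that defines the formal arguments.
name_call_args <- function(call, definition) {
  matched <- match.call(definition = definition, call = call)
  as.list(matched)[-1]  # drop the function name, keep the named arguments
}

# mean() is an S3 generic, so `trim` is defined on mean.default.
mean_args <- name_call_args(quote(mean(x, 0.2)), mean.default)

# Look up the default quantile method via getS3method, which works whether or
# not the method is exported.
quantile_default <- utils::getS3method("quantile", "default")
quantile_args <- name_call_args(quote(quantile(y, 0.1)), quantile_default)
```

With this, mean(x, 0.2) and mean(x, trim = 0.2) both produce the same named list, which could then be passed on to the specs.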

giorgi-ghviniashvili commented 2 years ago

> @giorgi-ghviniashvili, can you test the custom animations with facets to make sure everything works?

@jhofman I tested, and unfortunately it does not work with facets. We will need some more time to make it work with facets. I am not sure whether it will work with gemini or whether we will need some more "hacks".

In addition to that, I fixed some of the faceted view issues: when using these specs, we need to remove facet.column.sort and facet.row.sort.

Other fixes were done on the js side; there were facet alignment issues on error bars.
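
For specs generated on the R side, removing facet.column.sort and facet.row.sort could look roughly like the sketch below. It assumes the spec is a nested R list with facet$column and facet$row entries; the helper name strip_facet_sort and the example field names are hypothetical.

```r
# Hypothetical helper: drop the sort entries from the column and row facets,
# leaving every other facet property untouched.
strip_facet_sort <- function(spec) {
  for (channel in c("column", "row")) {
    if (!is.null(spec$facet[[channel]])) {
      spec$facet[[channel]]$sort <- NULL
    }
  }
  spec
}

# Example spec fragment (field names are made up for illustration).
spec <- list(facet = list(
  column = list(field = "Degree", type = "ordinal", sort = c("Masters", "PhD")),
  row = list(field = "Work", type = "ordinal", sort = c("Academia", "Industry"))
))
stripped <- strip_facet_sort(spec)
```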

jhofman commented 2 years ago

For now let's keep the custom_animation field as a simple string (instead of a more generic dictionary or something like that), because we don't have a correspondence between more complicated sets of functions or function arguments and visual states that need to be rendered. If/when that changes we can revisit.

So for now we'll do:

custom_animation = "count"
custom_animation = "mean"
custom_animation = "median"
custom_animation = "min"
custom_animation = "max"
custom_animation = "quantile(0.10)"
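
A small sketch of how these strings might be built on the R side. The helper name custom_animation_label is hypothetical, and the two-decimal format is only inferred from the "quantile(0.10)" example above.

```r
# Hypothetical helper: build the custom_animation string for a summary function.
# Quantiles carry their probability in the string; other functions use the bare name.
custom_animation_label <- function(fn, probs = NULL) {
  if (identical(fn, "quantile")) {
    stopifnot(is.numeric(probs), length(probs) == 1)
    return(sprintf("quantile(%.2f)", probs))
  }
  fn
}

labels <- c(
  custom_animation_label("median"),
  custom_animation_label("quantile", probs = 0.1)
)
```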

@willdebras can put some example specs for each in sandbox/ generated by R for @giorgi-ghviniashvili to try (as custom-animation-{function}-R.json), and @giorgi-ghviniashvili can put the hand-generated specs he was prototyping with there as well for @willdebras to see (as custom-animation-{function}-manual.json).

giorgi-ghviniashvili commented 2 years ago

I created a custom_animations folder and put the json specs there.

@willdebras please note the difference between count and the other types of custom animation specs:

willdebras commented 2 years ago

Awesome, thanks @giorgi-ghviniashvili. These make sense to me. I will add R generated specs in there for comparison tonight (tomorrow for you).

willdebras commented 2 years ago

I added count, min, max, median, and mean example specs.

I believe these count specs are what you are expecting, i.e. data.values gives a key-value pair of n and a count, but let me know if I am off base.

Quantile is a bit tricky and will need some updates to prep_specs_summarize.R. datamation_sanddance() actually breaks with quantile passed as a summary function. across has a hard time applying quantile without a given probs parameter (e.g. 0.1).

It breaks on this call. https://github.com/microsoft/datamations/blob/main/R/prep_specs_summarize.R#L477

While I have code in place to pass the custom animation meta specs, I need to amend the mappings passed to this function for the data to even generate for quantile. I'll work on this tomorrow so we can get this running for quantile.

giorgi-ghviniashvili commented 2 years ago

@willdebras scale.domain must be [0, 3]; with 0.5 and 2.5 there are alignment issues. This is true for all the json files.

image

Title should not be an array ([]); it should either be a string or not be present at all.

image

Please do not include color as an encoding if its field is null:

image

For min, max and median, I think we don't need the last spec, because the custom animation already does that: it plots min, max and median zoomed in as the last step.

Other than that, they look good. Please let me know when these are fixed and I will re-test.
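
The title and color fixes requested above could be applied to an R-side spec roughly as follows. This is a sketch under assumptions: the spec is a nested R list, an empty JSON array shows up as an empty list, and the color encoding lives at encoding$color. The helper name clean_spec is hypothetical, and the scale.domain fix is left out because its location in the spec is not shown here.

```r
# Hypothetical helper: remove an empty-array title and a color encoding
# that has no field, leaving the rest of the spec untouched.
clean_spec <- function(spec) {
  # An empty JSON array is assumed to appear as an empty list in R.
  if (is.list(spec$title) && length(spec$title) == 0) {
    spec$title <- NULL
  }
  # Drop the color encoding entirely when its field is null.
  if (!is.null(spec$encoding$color) && is.null(spec$encoding$color$field)) {
    spec$encoding$color <- NULL
  }
  spec
}

# Example spec fragment (field names are made up for illustration).
spec <- list(
  title = list(),
  encoding = list(
    x = list(field = "Degree", type = "nominal"),
    color = list(field = NULL, type = "nominal")
  )
)
cleaned <- clean_spec(spec)
```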

willdebras commented 2 years ago

Sounds good.

Updates here.

For the min, max, and median I still have the data states for the summary function in the end (i.e. the final spec). Should I just remove these entirely then? It will take some additional handling to not generate the summarize specs (i.e. the final summary function specs) for only specific summary functions. I can bake this in this week as well as the quantile updates.

giorgi-ghviniashvili commented 2 years ago

@jhofman custom animations with facets are now possible. Did an example for mean:

https://user-images.githubusercontent.com/6615532/153209597-9ff7986d-d85e-4859-a0c1-8c348b675c9e.mov

But there is an issue when the next spec has error bars. I will debug further and find out how to solve it.

@willdebras yes, please just remove summarized specs for now and let's test. (comment it out, we may need it later)

jhofman commented 2 years ago

Looks like between 14 seconds and 19 seconds in there's a shift of all points to the left.

Next steps will be to get the error bars and zoom steps working.

jhofman commented 2 years ago

@giorgi-ghviniashvili made some good progress on this, the shifting to the left is fixed but there are still some details to work out in the final frames of the custom animation.

also, there's an interesting thing that happens when we have overlapping values on the quantile (or median) custom animations---it becomes difficult to see the overlap and then it sort of visually looks like you're cutting the data at a different point than is specified in the quantile function. (it's actually doing the right thing, it just looks funky.)

i wonder if doing something more like mean where things are diagonal so that all points can be seen would be useful? then we could move the sliding bar up from the bottom to the appropriate percentile?

let's work on this the week after next.

giorgi-ghviniashvili commented 2 years ago

Updates on faceted custom animations:

mean:

https://user-images.githubusercontent.com/6615532/153747020-05a42f36-7f9b-4dcb-b658-e37a2dd890ac.mov

Max:

https://user-images.githubusercontent.com/6615532/153746461-43fee5d1-d86f-4de8-a2a5-51388a4b1387.mov

Min:

https://user-images.githubusercontent.com/6615532/153746517-492111f4-5f0f-44fa-80f5-36aaa7e3ff99.mov

Median and quantile: for some reason, gemini.recommendForSeq can not recommend the gemini animation specs and does not work. Might need to investigate further and/or file a ticket on gemini's github.

Count: @willdebras please provide faceted view for count. And also in general, please add facets + custom_animation specs to sandbox to be able to test.

giorgi-ghviniashvili commented 2 years ago

@willdebras can you push the R generated custom-animations-median-facet.json to the custom_animations branch? I would like to test the median spec. The manually generated spec does not work, and I don't know why.

giorgi-ghviniashvili commented 2 years ago

Update: made faceted custom animations work with median and quantile. (I needed to use some tricks!)

https://user-images.githubusercontent.com/6615532/155701952-3acc9c7c-d377-42b3-a826-5c5980215023.mov

In summary, I think we are good with custom animations, let's invest some time to test all custom animations using R generated specs and then merge it 🤞🤞

willdebras commented 2 years ago

Awesome!!

R generated specs here for custom animations if you still need them: https://github.com/microsoft/datamations/blob/custom_animations/sandbox/custom_animations/custom-animations-median-faceted-R.json

Sounds good on testing then merge!

jhofman commented 2 years ago

this looks terrific.

now that we have colors to denote groups, it's a bit jarring to see the green and yellow come in on the median step in this video.

let's simplify things and just keep the group colors and forget the gray/green/yellow. so in this case, female points stay all orange, male points stay red, and NA stay blue.

probably a good idea to propagate this to other custom animations. if it's possible to keep the colors on the bars for mean, then great. but if complicated we can skip it.

giorgi-ghviniashvili commented 2 years ago

@willdebras I tested all the specs and they work great, except custom-animations-binary-R.json, which has a mean animation. The problem is that the mean animation comes directly after the grid. Can the custom animation spec come after the jittered spec?

willdebras commented 2 years ago

So right now it is usually in the set of specs directly after the jitter spec, right?

https://github.com/microsoft/datamations/blob/custom_animations/sandbox/custom_animations/custom-animations-median-faceted-R.json#L3544

The jitter spec is generated in the group_by state. Currently the custom animation is always getting applied in the first summarize spec. This binary file doesn't produce any jittered specs because the binary variables I believe are always depicted in a grid. Do we need to add a jitter spec anywhere?

giorgi-ghviniashvili commented 2 years ago

Ah yes, you are right. So to make mean work after grid spec, we need to sort it first and then translate. Will try to fix that on my side.

giorgi-ghviniashvili commented 2 years ago

@willdebras all custom animations (except count) should always come after a spec where each datapoint has a y value. That's needed because the first step is to animate the points into a slash shape (/):

image

--- then draw bars with mean lines and then collapse.
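
To illustrate why y values are required, here is a rough sketch of how a slash layout could be computed: points are ranked by y and spread evenly from left to right, so the smallest value ends up at the lower left and the largest at the upper right. This is an assumed reading of the design, not the actual js implementation, and the helper name slash_layout is hypothetical.

```r
# Hypothetical helper: place points on a diagonal by giving the i-th smallest
# y value the i-th horizontal slot between x_min and x_max.
slash_layout <- function(y, x_min = 0, x_max = 1) {
  n <- length(y)
  slot <- rank(y, ties.method = "first")  # ties keep their original order
  x <- if (n > 1) {
    x_min + (slot - 1) / (n - 1) * (x_max - x_min)
  } else {
    rep((x_min + x_max) / 2, n)           # a single point sits in the middle
  }
  data.frame(x = x, y = y)
}

layout <- slash_layout(c(30, 10, 20))
mean_line <- mean(layout$y)  # height at which the mean line would be drawn
```

Without per-point y values (as in the binary grid case), there is nothing to rank, which is why this step cannot directly follow a grid spec.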

In the case of binary variables, we only have a grid; we don't have y values for each datapoint. Instead, a player has a batting average for the year.

I think that we need to show a jittered spec after the grid and before the custom_animation, or we need to think of a different animation, especially for the "slash shape step".

I tried to stack the circles to get a transform like that, but because we have so many points, they overlap and it is not really clear what's going on.

image

jhofman commented 2 years ago

i agree, i don't think custom animations for binary variables make sense, at least not for mean.

@giorgi-ghviniashvili can you work on the color issues above for next meeting?

giorgi-ghviniashvili commented 2 years ago

@jhofman fixed color issue:

https://user-images.githubusercontent.com/6615532/157484711-a7be79b9-9d74-49ef-99a5-b20b3e29d42e.mov

https://user-images.githubusercontent.com/6615532/157484741-622af92b-d5b9-4543-b213-a87c0cd939e1.mov

https://user-images.githubusercontent.com/6615532/157485191-812c96a7-4cc9-4d10-9efe-a369ebdabfe9.mov

willdebras commented 2 years ago

@giorgi-ghviniashvili The binary specs have been updated to remove custom animations meta spec. All binary specs now exclude this meta spec:

https://github.com/microsoft/datamations/blob/custom_animations/sandbox/custom_animations/custom-animations-binary-R.json

jhofman commented 2 years ago

the mean animation looks great.

the median animation has a jump after the medians are calculated from 7 to 8 seconds. maybe this is just a problem w/ the spec?

the max animation looks good until the very end when some ghost points appear below the correct points.