Closed jhofman closed 2 years ago
@jhofman I have read the papers.
Just forked this repo and created a demo of designs folder.
These examples of transitions are great. Transitions like these make much more sense than plain linear interpolation from point A to point B.
I don't think that gemini or any other library will support these kinds of custom transitions. We need to code these ourselves, as these guys did.
Take a look at this one as well: demo
@giorgi-ghviniashvili: these are great. you commented that gemini won't support these. do you think we can do them with d3 on top of vegalite plots, or do the plots themselves have to be entirely done in d3?
@dggoldst, take a look at these two demos for different ways to visualize different types of aggregation functions used to summarize data:
https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/apps https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/
These demos are done with d3 because it is more flexible than vega (low-level access to transitions, scales, and drawing).
But after playing with vega a bit, I think we can do the same with vega, but we will need multiple vega specs: one for each simple step.
For example, for the arithmetic mean, we have multiple steps, and each step introduces new elements or instructions: an area or lines, circle transforms. All of these will need new vega specs. With gemini sync we can only specify the animation sequence of the elements that need to be transformed.
To sum up, I am not sure how smooth the animation will be with gemini and vega, or how easy it will be to generate all the specs, but I think it is possible.
After playing with Sharla's specs and drawing the grid, I think it is easy to customize each frame as we want: hide axes, change the domain, scale, axis types, etc.
Got it. So seems like going with gemini and vega wouldn't make it impossible to implement these transitions down the line if we wanted to later, but it might be a lot of work.
Do you think it's worth trying this approach for just one aggregation operation like count or median to see how it goes?
> Got it. So seems like going with gemini and vega wouldn't make it impossible to implement these transitions down the line if we wanted to later, but it might be a lot of work.
Yes, it is not impossible, but it is a lot of work. Agreed.
> Do you think it's worth trying this approach for just one aggregation operation like count or median to see how it goes?
Yes, I think it is worth trying.
Snoozing this, but when we do get to it, we'll try Gemini2 as per #69.
@sharlagelfand will explore doing a custom animation for mean, which requires parsing the summarize function a bit more closely, then @giorgi-ghviniashvili can prototype it.
I've updated the summary function parsing so that if the summary function is mean or median, the first spec produced by prep_spec_summarize will contain meta.custom_animation = "mean" (or "median").
fyi @chisingh this is something that should be added on the python side as well, to ensure consistency between the specs
@giorgi-ghviniashvili will implement custom versions of the first mean and median animations listed here: https://giorgi-ghviniashvili.github.io/aggregate-animation-data/designs/
@jhofman median animation is there 🔥
https://user-images.githubusercontent.com/6615532/149820520-83319a4b-7e52-4b3a-8658-5a7fbdb5f382.mov
P.S. there is a difference between count and median: median needs some initial y values to sort by, while count does not need that and can be calculated from the grid spec. I added an intermediate frame with randomized data after the grid.
@jhofman mean animation is there 🔥
https://user-images.githubusercontent.com/6615532/149934277-3f57e451-a348-4b47-8abc-f01d69d4802d.mov
the mean animation looks great!
two small tweaks:
median is also really cool.
tweaks:
after that, min and max would be the next obvious custom animations to implement.
updated median animation:
https://user-images.githubusercontent.com/6615532/150313094-e17f0faf-0ef2-4fea-b293-2b1985df7a4a.mov
updated mean animation:
https://user-images.githubusercontent.com/6615532/150316905-eb0fa5d6-5ab1-419f-adc3-e358b6346640.mov
great!
a few tweaks we discussed:
also, let's check that jitter is working fine when these specs are generated from R code and see how it looks. @giorgi-ghviniashvili, can you create a video of it with jitter so we can see if it looks weird or not?
also, we talked about count always doing an info grid (even if previous frame shows continuous values), which seems fine for now but we can revisit if needed.
side note, steps for debugging R to get specs are:
library(datamations)
library(dplyr)
debug(datamations::datamation_sanddance)
"small_salary %>% group_by(Degree) %>% summarise(mean = mean(Salary))" %>% datamation_sanddance()
# step through code until second to last line of function
clipr::write_clip(res)
(there's probably a better way, but this works at least.)
Nice on the fade out.
It looks like some of the points move before fading out, such as on the lower left between the 2 and 3 second mark. Any idea what's up there?
@jhofman I noticed it and fixed it. gemini_id was missing in change.data. Gemini recommend was not setting it by itself.
Min: https://user-images.githubusercontent.com/6615532/151529100-4cf3493c-0578-4e40-a232-2336feddbbdf.mov
Max: https://user-images.githubusercontent.com/6615532/151529219-2633f898-1a8b-41da-ad29-30d510e34942.mov
P.S. the code for all custom animations is in this branch
Jitter works. Couple of comments though.
Here is median and quantile with jitter: https://user-images.githubusercontent.com/6615532/151532891-63aec884-1f6e-4969-8260-8dc8ff527bfa.mov
Mean + jitter: https://user-images.githubusercontent.com/6615532/151534558-e524cc4a-76dd-43e9-8f66-a9d062de664f.mov
Min + jitter: (max is same, but lines at the top) https://user-images.githubusercontent.com/6615532/151534967-639898ff-353e-457c-bb60-c3b87f84192f.mov
@willdebras next step on this is related to #137, which is to parse and pass more custom functions in the vegalite spec.
For instance, right now we definitely have meta.custom_animation = "count" and possibly have meta.custom_animation = "mean" being added to vegalite specs for different steps, but most likely we don't have min, max, median, or quantile.
It would be nice to have a generic function parsing mechanism of the following type:
df %>%
  group_by(x) %>%
  summarize(z = f(y, a, b, ...))
where you could pull out that the function being called is f, the variable being summarized is y, and the extra parameters being passed are a, b, etc. (this comes up in something like quantile(y, 0.1))
My guess is that this could go in prep_specs_summarize.R or possibly parse_functions.R, but let's see.
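A minimal sketch of that kind of parsing in base R, working on the unevaluated call. The helper name parse_summary_call is hypothetical (it is not part of datamations), and it assumes the first argument of the summary function is always the variable being summarized:

```r
# Hypothetical helper (not part of datamations): pull the function name,
# the summarized variable, and the extra parameters out of one summary call.
parse_summary_call <- function(expr) {
  stopifnot(is.call(expr))
  args <- as.list(expr)[-1]            # drop the function name itself
  list(
    fn = deparse(expr[[1]]),           # function being called, e.g. "quantile"
    variable = deparse(args[[1]]),     # assumed: first argument is the variable
    params = args[-1]                  # remaining arguments, named or positional
  )
}

# The summarize() call from a pipeline, kept unevaluated.
summarize_call <- quote(summarize(z = quantile(y, 0.1)))
summary_exprs <- as.list(summarize_call)[-1]  # named list: z = quantile(y, 0.1)

parsed <- parse_summary_call(summary_exprs$z)
parsed$fn        # "quantile"
parsed$variable  # "y"
parsed$params    # list(0.1)
```

Positional parameters stay unnamed here, so quantile(y, 0.1) and quantile(y, probs = 0.1) would come out slightly differently.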
Also reminder that we want to reflect the summary operation in the title.
p.s. @willdebras, see this comment for one thought on how to debug things, happy to know if there's a better way (i imagine there is):
https://github.com/microsoft/datamations/issues/18#issuecomment-1017611841
@giorgi-ghviniashvili, can you test the custom animations with facets to make sure everything works?
> p.s. @willdebras, see this comment for one thought on how to debug things, happy to know if there's a better way (i imagine there is):
Right now in prep_specs_summarize.R the meta.custom_animation is passed straight from the mappings and description:
if (mapping$summary_function %in% c("mean", "median")) {
  spec[["meta"]][["custom_animation"]] <- mapping$summary_function
}
It will be pretty straightforward to update the meta specs here to just include more summary functions from the summary_function mapping.
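As a rough illustration of that update, the check could be driven by a single vector of supported functions. The vector and the helper add_custom_animation below are hypothetical names for this sketch, and the spec is assumed to be a plain nested list that already has a meta entry:

```r
# Assumed set of summary functions that have a custom animation.
# quantile is left out here because it needs extra handling.
custom_animation_functions <- c("mean", "median", "min", "max")

# Hypothetical helper: tag a spec with meta.custom_animation when the
# summary function is one of the supported ones; otherwise leave it alone.
add_custom_animation <- function(spec, summary_function) {
  if (summary_function %in% custom_animation_functions) {
    spec[["meta"]][["custom_animation"]] <- summary_function
  }
  spec
}

spec <- add_custom_animation(list(meta = list()), "max")
spec$meta$custom_animation  # "max"
```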
For the generic function parsing, I have a good sense of how to implement it. The fittings object toward the beginning of datamations_sanddance() actually already parses these pretty well, e.g. the trim parameter call to median, so it's not a huge lift to pass these into the specs.
So to implement this, I am curious where we want these to end up in the specs that get passed to vegalite? Should these additional parameters be passed to the mapping directly or end up in meta specs? I see for the quantile issue, we are expecting just a string like "quantile(0.10)", but if we want this generic parsing, should it end up a list with named values?
This would change the approach here a bit, i.e. whether I change the args passed to parse_functions.R or generate_mapping.R, or just add new definitions to the meta list.
The fittings object, though, does not return the name of the arg if the name isn't explicitly provided, e.g. mean(x, 0.2) vs. mean(x, trim = 0.2). I think we can parse the result of calling base::args() or base::formals() on the summary function, e.g. args(mean) (which returns $x, $trim, etc.), to fill these in and provide them to the vegalite specs.
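One way this could look, sketched with base R's match.call(). One wrinkle: mean itself is an S3 generic whose formals are only x and ..., so the sketch matches against the default method (mean.default), which is where trim is declared. The helper name name_call_args is hypothetical, and primitives such as min and max are not handled because match.call() needs a closure:

```r
# Hypothetical helper: name the positional arguments of a summary call by
# matching them against the formals of the function being called.
name_call_args <- function(expr) {
  fn_name <- as.character(expr[[1]])
  # Prefer the default S3 method, since generics like mean() only declare
  # (x, ...); fall back to the function itself when there is no such method.
  definition <- utils::getS3method(fn_name, "default", optional = TRUE)
  if (is.null(definition)) {
    definition <- get(fn_name, mode = "function")
  }
  # match.call() works on the unevaluated call, so x and y need not exist.
  as.list(match.call(definition, expr))[-1]
}

mean_args <- name_call_args(quote(mean(x, 0.2)))
names(mean_args)      # "x" "trim"

quantile_args <- name_call_args(quote(quantile(y, 0.1)))
names(quantile_args)  # "x" "probs"
```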
> @giorgi-ghviniashvili, can you test the custom animations with facets to make sure everything works?
@jhofman tested and it does not work with facets unfortunately. We will need some more time on this to make it work in facets. I am not sure if it will work with gemini or we will need some more "hacks".
In addition to that, I fixed some of the faceted view issues:
When using these specs, we need to remove facet.column.sort and facet.row.sort.
Other fixes were done on the JS side: there were facet alignment issues on error bars.
For now let's keep the custom_animation field as a simple string (instead of a more generic dictionary or something like that), because we don't have a correspondence between more complicated sets of functions or function arguments and visual states that need to be rendered. If/when that changes we can revisit.
So for now we'll do:
custom_animation = "count"
custom_animation = "mean"
custom_animation = "median"
custom_animation = "min"
custom_animation = "max"
custom_animation = "quantile(0.10)"
@willdebras can put some example specs for each in sandbox/ generated by R for @giorgi-ghviniashvili to try (as custom-animation-{function}-R.json), and @giorgi-ghviniashvili can put the hand-generated specs he was prototyping with there as well for @willdebras to see (as custom-animation-{function}-manual.json).
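For reference, a small sketch of how the string could be built on the R side. The helper custom_animation_label is hypothetical; the only thing taken from the list above is the target format, with the quantile probability printed to two decimals:

```r
# Hypothetical helper: build the custom_animation string for a summary
# function. Only quantile carries a parameter in the string.
custom_animation_label <- function(fn, probs = NULL) {
  if (identical(fn, "quantile")) {
    stopifnot(is.numeric(probs), length(probs) == 1)
    return(sprintf("quantile(%.2f)", probs))
  }
  fn
}

custom_animation_label("mean")           # "mean"
custom_animation_label("quantile", 0.1)  # "quantile(0.10)"
```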
I created a custom_animations folder and put the JSON specs there.
@willdebras please note the difference between count and the other types of custom animation specs:
"count spec" must be a grid.
data.values of "count spec" should have datamations_y equal to the count, while the other types of specs just need real values.
Awesome, thanks @giorgi-ghviniashvili. These make sense to me. I will add R generated specs in there for comparison tonight (tomorrow for you).
I added count, min, max, median, and mean example specs.
I believe these count specs are what you are expecting, i.e. data.values gives a key-value pair of n and a count, but let me know if I am off base.
Quantile is a bit tricky and will need some updates to prep_specs_summarize.R. datamations_sanddance() actually breaks with quantile passed as a summary function. across has a hard time applying quantile without a given probs parameter (e.g. 0.1).
It breaks on this call. https://github.com/microsoft/datamations/blob/main/R/prep_specs_summarize.R#L477
While I have code in place to pass the custom animation meta specs, I need to amend the mappings passed to this function for the data to even generate for quantile. I'll work on this tomorrow so we can get this running for quantile.
@willdebras scale.domain must be [0, 3]; with 0.5 and 2.5 it has alignment issues. This is true for all the JSONs.
Title should not be an array ([]): it should either be a string or not be present at all.
Please do not include color as an encoding if its field is null:
For min, max and median, I think we don't need the last spec, because the custom animation already does that: it plots min, max and median zoomed in as the last step.
Other than that, they look good. Please let me know when these are fixed and I will re-test.
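A rough sketch of the title and color cleanups, assuming the spec is a nested list with title and encoding at the top level (faceted specs may nest these differently). The helper clean_spec is hypothetical and the example spec is made up:

```r
# Hypothetical helper: tidy a generated spec before it is written to JSON.
clean_spec <- function(spec) {
  # Drop an empty title rather than emitting title: [].
  if (length(spec[["title"]]) == 0) {
    spec[["title"]] <- NULL
  }
  # Drop the color encoding when it has no field.
  color <- spec[["encoding"]][["color"]]
  if (!is.null(color) && is.null(color[["field"]])) {
    spec[["encoding"]][["color"]] <- NULL
  }
  spec
}

# Made-up spec, for illustration only.
spec <- list(
  title = list(),
  encoding = list(
    x = list(field = "datamations_x"),
    color = list(field = NULL, type = "nominal")
  )
)
cleaned <- clean_spec(spec)
names(cleaned)           # "encoding"
names(cleaned$encoding)  # "x"
```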
Sounds good.
Updates here.
For the min, max, and median I still have the data states for the summary function in the end (i.e. the final spec). Should I just remove these entirely then? It will take some additional handling to not generate the summarize specs (i.e. the final summary function specs) for only specific summary functions. I can bake this in this week as well as the quantile updates.
@jhofman custom animations with facets are now possible. Did an example for mean:
https://user-images.githubusercontent.com/6615532/153209597-9ff7986d-d85e-4859-a0c1-8c348b675c9e.mov
But there is an issue when the next spec has error bars. Will debug further and find out how to solve it.
@willdebras yes, please just remove summarized specs for now and let's test. (comment it out, we may need it later)
Looks like between 14 seconds and 19 seconds in there's a shift of all points to the left.
Next steps will be to get the error bars and zoom steps working.
@giorgi-ghviniashvili made some good progress on this, the shifting to the left is fixed but there are still some details to work out in the final frames of the custom animation.
also, there's an interesting thing that happens when we have overlapping values on the quantile (or median) custom animations---it becomes difficult to see the overlap and then it sort of visually looks like you're cutting the data at a different point than is specified in the quantile function. (it's actually doing the right thing, it just looks funky.)
i wonder if doing something more like mean where things are diagonal so that all points can be seen would be useful? then we could move the sliding bar up from the bottom to the appropriate percentile?
let's work on this the week after next.
Updates on faceted custom animations:
mean:
https://user-images.githubusercontent.com/6615532/153747020-05a42f36-7f9b-4dcb-b658-e37a2dd890ac.mov
Max:
https://user-images.githubusercontent.com/6615532/153746461-43fee5d1-d86f-4de8-a2a5-51388a4b1387.mov
Min:
https://user-images.githubusercontent.com/6615532/153746517-492111f4-5f0f-44fa-80f5-36aaa7e3ff99.mov
Median and quantile: for some reason, gemini.recommendForSeq cannot recommend the gemini animation specs and does not work. Might need to investigate further and/or file a ticket on gemini's github.
Count: @willdebras please provide a faceted view for count. And also, in general, please add facets + custom_animation specs to sandbox to be able to test.
@willdebras can you push the R generated custom-animations-median-facet.json to the custom_animations branch? I would like to test the median spec. The manually generated spec does not work, and I don't know why.
Update: made faceted custom animations work with median and quantile. (I needed to make some tricks!)
https://user-images.githubusercontent.com/6615532/155701952-3acc9c7c-d377-42b3-a826-5c5980215023.mov
In summary, I think we are good with custom animations, let's invest some time to test all custom animations using R generated specs and then merge it 🤞🤞
Awesome!!
R generated specs here for custom animations if you still need them: https://github.com/microsoft/datamations/blob/custom_animations/sandbox/custom_animations/custom-animations-median-faceted-R.json
Sounds good on testing then merge!
this looks terrific.
now that we have colors to denote groups, it's a bit jarring to see the green and yellow come in on the median step in this video.
let's simplify things and just keep the group colors and forget the gray/green/yellow. so in this case, female points stay all orange, male points stay red, and NA stay blue.
probably a good idea to propagate this to other custom animations. if it's possible to keep the colors on the bars for mean, then great. but if complicated we can skip it.
@willdebras I tested all the specs and they work great, except custom-animations-binary-R.json, which has the mean animation. The problem is that we have the mean animation directly after the grid. Can the custom animation spec come after a jittered spec?
So right now it is usually in the set of specs directly after the jitter spec, right?
The jitter spec is generated in the group_by state. Currently the custom animation is always getting applied in the first summarize spec. This binary file doesn't produce any jittered specs because the binary variables I believe are always depicted in a grid. Do we need to add a jitter spec anywhere?
Ah yes, you are right. So to make mean work after grid spec, we need to sort it first and then translate. Will try to fix that on my side.
@willdebras all custom animations (except count) should always come after a spec where each datapoint has a y value. That's needed because the first step is to animate the points into a slash shape (/), then draw bars with mean lines, and then collapse.
In the case of binary variables, we only have a grid; we don't have y values for each datapoint. Instead a player has a batting average in the year.
I think that we need to show a jittered spec after the grid and before custom_animation, or need to think of a different animation, especially for the "slash shape" step.
I tried to stack the circles to get a transform like that, but because we have so many points, they overlap and it is not really clear what's going on.
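To make the dependency on per-point y values concrete, here is a minimal sketch of one way a "slash" layout could be computed: sort the points by y and use the rank as the x position. This is an assumption about the layout, not the code in the custom animations branch, and slash_layout is a hypothetical name:

```r
# Hypothetical layout step: place each point at an x position equal to its
# rank by y, so the points form a rising diagonal (/). Ties are broken by
# order of appearance so that no two points share an x position.
slash_layout <- function(y) {
  data.frame(
    x = rank(y, ties.method = "first"),
    y = y
  )
}

layout <- slash_layout(c(30, 10, 20, 20))
layout$x  # 4 1 2 3
```

With only a grid there is no y to rank by, which is why this step has nothing to work with for binary variables.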
i agree, i don't think custom animations for binary variables make sense, at least not for mean.
@giorgi-ghviniashvili can you work on the color issues above for next meeting?
@jhofman fixed color issue:
https://user-images.githubusercontent.com/6615532/157484711-a7be79b9-9d74-49ef-99a5-b20b3e29d42e.mov
https://user-images.githubusercontent.com/6615532/157484741-622af92b-d5b9-4543-b213-a87c0cd939e1.mov
https://user-images.githubusercontent.com/6615532/157485191-812c96a7-4cc9-4d10-9efe-a369ebdabfe9.mov
@giorgi-ghviniashvili The binary specs have been updated to remove custom animations meta spec. All binary specs now exclude this meta spec:
the mean animation looks great.
the median animation has a jump after the medians are calculated from 7 to 8 seconds. maybe this is just a problem w/ the spec?
the max animation looks good until the very end when some ghost points appear below the correct points.
Right now mean shows points collapsing. Here are suggestions for how other aggregation operations can be animated: https://idl.cs.washington.edu/files/2019-AnimatedAggregates-EuroVis.pdf