Enable 'object constancy' on data refresh by supporting a key function

palantir / plottable

A library of modular chart components built on D3

http://plottablejs.org/

MIT License

Enable 'object constancy' on data refresh by supporting a key function #2512

Open softwords opened 9 years ago

softwords commented 9 years ago

'Object constancy', as defined in http://bost.ocks.org/mike/constancy/, requires a 'key' function passed to selection.data() to provide a unique identifier for each datum. The key determines which data go into the enter, exit, and update selections.

Could dataset get a key method, and/or enhanced constructor, to specify this key function?

Then, we'd need some way to get to the enter, exit, and update d3 transitions.

jtlan commented 9 years ago

@softwords, can you explain how you're planning to use this feature? Is there a custom animation you're trying to implement?

softwords commented 9 years ago

@jtlan: here's a quick jsfiddle http://jsfiddle.net/d4gngj2s/2/ that I hope will illustrate.

The data comes from the World Cup finals of 2006, 2010, and 2014. For the last 32 teams, we have name, rank, goals for, and goals against. The scatter chart plots a point for each team, with x = GF and y = GA. The color of the point encodes the rank. When I change from one year to another, I want the point that represents a team to move to its new position and to transition from its current color to its new color. Some teams also drop out; these points could, for example, transition size => 0 or opacity => 0 before being removed. New teams will come in; these could transition size and opacity in the opposite direction. But to make this work, you need the key function to determine who comes in, who continues, and who drops out. There are always 32 points, so if the default key (i.e. the index) is used, there will never be enters or exits after the initial load. In my data, the key function should be: function(d) {return d.name;}
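To make the difference between the two keys concrete, here is a minimal, library-free sketch. The `join` helper and the sample rows are hypothetical (made-up teams and numbers, not the World Cup data, and not Plottable or D3 code); the helper only mimics the bookkeeping that `selection.data(data, key)` performs when it sorts data into enter, update, and exit groups.

```javascript
// Hypothetical helper: mimics the bookkeeping of a D3 keyed data join.
// oldData is what is currently drawn; newData is the refreshed data.
function join(oldData, newData, key) {
  const oldKeys = new Set(oldData.map(key));
  const newKeys = new Set(newData.map(key));
  return {
    enter: newData.filter((d, i) => !oldKeys.has(key(d, i))),
    update: newData.filter((d, i) => oldKeys.has(key(d, i))),
    exit: oldData.filter((d, i) => !newKeys.has(key(d, i))),
  };
}

// Made-up rows for illustration only.
const before = [
  { name: "Team A", GF: 3, GA: 1 },
  { name: "Team B", GF: 2, GA: 2 },
  { name: "Team C", GF: 0, GA: 4 },
];
const after = [
  { name: "Team A", GF: 5, GA: 0 },
  { name: "Team D", GF: 1, GA: 1 },
  { name: "Team B", GF: 2, GA: 3 },
];

const byName = (d) => d.name;   // the key proposed in this issue
const byIndex = (d, i) => i;    // what an unkeyed join amounts to

const keyed = join(before, after, byName);
const indexed = join(before, after, byIndex);

console.log(keyed.enter.map(byName));  // [ 'Team D' ]
console.log(keyed.exit.map(byName));   // [ 'Team C' ]
console.log(indexed.enter.length, indexed.exit.length);  // 0 0
```

With the name key, Team D enters and Team C exits; with the index key, both arrays have three rows, so everything is an update and nothing can fade in or out.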

softwords commented 9 years ago

Hello @jtlan

I've investigated this a bit further, may I share some observations:

Drawer._bindSelectionData() is the point where a key function would need to be accessed, in order that enter() and exit() are correctly identified with respect to this key. Since Drawer holds a reference to the Dataset, the key could be accessed from the Dataset metadata; e.g. by the convention that a member named key in the metadata object is a function returning the key.

So you'd have something like:

var ds = new Plottable.Dataset(data)
         .metadata({key: function(d,i) { return d.name;}});

and Drawer._bindSelectionData could look for this:

var dataElements;
// metadata() is the Dataset's getter (see the setter call above), so call it to reach the key
var metadata = this._dataset && this._dataset.metadata();
if (metadata && metadata.key) {
    dataElements = this.selection().data(data, metadata.key);
}
else {
    dataElements = this.selection().data(data);
}

This would allow an animator to apply a transition to those items that are common to the original and new datasets. However, most plots actively prevent this from happening by resetting those elements before animating; e.g. in ScatterPlot:

protected _generateDrawSteps(): Drawers.DrawStep[] {
      var drawSteps: Drawers.DrawStep[] = [];
      if (this._animateOnNextRender()) {
        var resetAttrToProjector = this._generateAttrToProjector();
        resetAttrToProjector["d"] = () => "";
        drawSteps.push({attrToProjector: resetAttrToProjector, animator: this._getAnimator(Plots.Animator.RESET)});
      }

This could be overcome by deriving from the ScatterPlot and overriding this member.

What we still cannot do is apply a transition to exit() or enter(). One reason is that, as far as I can see, we cannot get these selections back once we leave Drawer._bindSelectionData. I think it would be more general to pass the dataElements created in Drawer._bindSelectionData to the animator, rather than Drawer.selection().

This would allow an animator to apply a predefined exit transition:

selection.exit()
  .transition()
    .ease(this.easingMode())
    .duration(this.stepDuration())
    .attr("opacity", 0)
    .remove();

A "fadein" transition for new items seems more of a problem - while we can get to selection.enter() in the animator, I can't see how to get from there to the collection of empty nodes created against these items:

dataElements.enter()
   .append(this._svgElementName);

so one way around this is to tag them on creation:

dataElements.enter()
   .append(this._svgElementName)
   .attr("enter", "true");

Then the animator is able to retrieve these items to initialise them before the transition starts:

selection
   .filter("[enter='true']")
   .attr("enter", null)
   .attr("opacity", 0);

selection.exit()
   .transition()
     .ease(this.easingMode())
     .duration(this.stepDuration())
     .attr("opacity", 0)
     .remove();

Perhaps a more far-reaching solution would be to allow the plot to define how to initialise incoming elements, and how to transition outgoing elements before removing them. For example, I think it would be very nice to write this:

var scatter = new Plottable.Plots.Scatter()
    .addDataset(scatterds)
    .x(function (d) { return d.XValue; }, xScale)
    .y(function (d) { return d.YValue; }, yScale)
    .symbol(symbolFcn)
    .attr("fill", function (d, i, dataset) { return d[scope.colorBy]; }, colorScale)
    .size(function (d) { return 16; })
    .attr("opacity", .8);

// define starting values
scatter.enter()
    .size(0)
    .attr("opacity", 0);

scatter.exit()
    .size(0)
    .attr("opacity", 0);

I hope these notes provide some useful food for thought. I have updated the fiddle : http://jsfiddle.net/d4gngj2s/3/ using a plottable.js in which I made the changes I've described.

jtlan commented 9 years ago

@softwords, thanks for the detailed explanation!

Probably key() should be a method on Plot, rather than a specialized field in the metadata of Datasets, since other API points that specify a display property based on the data are on the Plot (x(), size(), etc). It probably wouldn't be that difficult to add this API point.

I'm not sure what correct API would be for controlling the enter/exit transitions. While using enter()/exit() the same way they're used in D3 has advantages due to symmetry with the D3 API, there isn't much else in our API like it. @bluong, @aicioara, @ztsai, @crmorford, care to weigh in?

> I have updated the fiddle : http://jsfiddle.net/d4gngj2s/3/ using a plottable.js in which I made the changes I've described.

Did you edit plottable.js directly, or are you editing the source TypeScript files?

softwords commented 9 years ago

@jtlan, thank you for your reply. I'm afraid I just hacked the JavaScript directly, to try out these ideas. I'd argue that key is an assertion about the data: that the function is guaranteed to return a unique, semantically meaningful value when applied to each datum. It tells us about the structure of the data, not the visual representation of the data. So I think conceptually it is better associated with the Dataset than the Plot. There are practicalities as well: the Dataset may be shared by multiple plots, of course. Also, Drawer knows about the Dataset but not about the Plot, so it could be trickier to get the key into _bindSelectionData if it is an attribute of the calling Plot. key could be implemented as a method on Dataset, rather than as an attribute of Dataset.metadata; that would be more explicit.

var ds = new Plottable.Dataset(data)
         .key(function(d,i) { return d.name;});
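As a rough sketch of what such an accessor might look like, the class below is a hypothetical stand-in for Plottable.Dataset (the name `KeyedDataset` is invented here). It assumes the same getter/setter convention as `metadata()`, and it defaults to join-by-index purely for illustration; the right default is a separate design question.

```javascript
// Hypothetical stand-in for Plottable.Dataset, showing only the proposed key() accessor.
class KeyedDataset {
  constructor(data) {
    this._data = data;
    // Assumed default for this sketch: join by index.
    this._key = (d, i) => i;
  }

  data() {
    return this._data;
  }

  // Getter when called with no argument, chainable setter otherwise.
  key(keyFn) {
    if (keyFn === undefined) {
      return this._key;
    }
    this._key = keyFn;
    return this;
  }
}

const ds = new KeyedDataset([{ name: "Team A" }, { name: "Team B" }])
  .key((d) => d.name);

// Any consumer holding the dataset (for example, several drawers) reads the same key.
const keys = ds.data().map(ds.key());
console.log(keys);  // [ 'Team A', 'Team B' ]
```

Because the key lives on the dataset, every plot that shares the dataset sees the same identity for each datum without having to repeat it.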

softwords commented 9 years ago

@jtlan: I've updated the fiddle again: http://jsfiddle.net/d4gngj2s/4/. This version demonstrates a possible solution for enter() and exit() methods to define a plot's transitions. Once again I've just done this by editing the plottable.js JavaScript directly.

The basic idea I have used is to extend the DrawStep object by including two new properties: enter and exit. Each of these is an AttributeToProjector, and they define the setup of incoming objects and the transition on exiting objects respectively. Drawer._drawStep manages applying the appropriate projectors to the enter, update and exit selections.

if (step.enter) {
    selection
        .filter(".enter")
        .classed("enter", false)
        .attr(step.enter);
}
// Modified: assume the DrawStep may have members enter and exit
if (step.exit) {
    step.animator.animate(selection.exit(), step.exit)
        .remove();
}
else {
    selection.exit().remove();
}
step.animator.animate(selection, step.attrToAppliedProjector);

Now, the Plot itself knows how to create the AttributeToProjector it needs from its collection of attribute and property bindings, so I have simply implemented enter() and exit() methods on the plot (specifically on the Scatter plot) to return a new instance of that same plot. Then _generateDrawSteps can add these properties to the DrawStep if they are needed:

var ds = { attrToProjector: this._generateAttrToProjector(), animator: this._getAnimator(Plots.Animator.MAIN) };
if (this.enterattrs) {
    ds.enter = this.enter()._generateAttrToProjector();
}
if (this.exitattrs) {
    ds.exit = this.exit()._generateAttrToProjector();
}
drawSteps.push(ds);
return drawSteps;

With these changes, this code works as you would hope: objects can shrink, swell, fade in, fade out, fly in, fly out, and even change color, with these transitions respecting the object constancy defined by the key function on the Dataset:

    plot
        .x(function(d) { return d.GF; }, xScale)
        .y(function(d) { return d.GA; }, yScale)
        .attr("title", function (d) { return d.name })
        .size(12)
        .attr("fill", function(d) { return rankcolor(d.R); });

    plot.enter()
        .x(function(d) { return d.GF; }, xScale)
        .y(function(d) { return 0; }, yScale)
        .size(30)
        .attr("opacity",0);

    plot.exit()
        .attr('fill','pink')
        .x(function(d) { return 0; }, xScale)
        .y(function(d) { return d.GA; }, yScale)
        .size(3);

softwords commented 9 years ago

@jtlan ( and @bluong, @aicioara, @ztsai, @crmorford !)

I hope you can find a moment to look at another proof-of-concept fiddle: http://jsfiddle.net/oaskqta0/5/

This demonstrates object constancy in a bar chart; it's essentially equivalent to the functionality in http://bost.ocks.org/mike/constancy/. The transitions are specified with enter() and exit() methods on the plot.

If this functionality were implemented in plottable, I think the current _animateOnNextRender logic could be deprecated. It seems to me that, wherever it is implemented, what it does is the same as treating every datum as though it were in enter(). You can force every point to be in enter() by using a key function that will always return a new value, e.g.

var counter = 0;
var noJoinKey = function() { return counter++;};

...
var ds = new Plottable.Dataset(data)
   .key(noJoinKey);

You can see this behaviour in the fiddle by clicking No constancy. Every bar exits, and new bars enter.

Join on index shows the default join strategy based on the index into the data array. With this option bars change shape and colour, but represent different countries each time. There is no enter or exit (because the data sets are all the same size) - every bar updates.
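The three strategies can be compared side by side with a small counting sketch. Everything here is hypothetical and library-free: `joinCounts` is an invented helper that evaluates the key once per datum and reports how many items would land in each group, and the rows are placeholders rather than the fiddle's data.

```javascript
// Hypothetical helper: count how many data land in enter / update / exit.
function joinCounts(oldData, newData, key) {
  const oldKeys = oldData.map(key);  // keys of the elements already on screen
  const newKeys = newData.map(key);  // keys of the incoming data
  const oldSet = new Set(oldKeys);
  const newSet = new Set(newKeys);
  return {
    enter: newKeys.filter((k) => !oldSet.has(k)).length,
    update: newKeys.filter((k) => oldSet.has(k)).length,
    exit: oldKeys.filter((k) => !newSet.has(k)).length,
  };
}

let counter = 0;
const noJoinKey = () => counter++;  // always a fresh value, so nothing ever matches
const byIndex = (d, i) => i;        // the default join
const byName = (d) => d.name;       // a genuine key

// Placeholder rows: same size, different membership.
const first = [{ name: "A" }, { name: "B" }, { name: "C" }];
const second = [{ name: "B" }, { name: "C" }, { name: "D" }];

const noConstancy = joinCounts(first, second, noJoinKey);
const onIndex = joinCounts(first, second, byIndex);
const onName = joinCounts(first, second, byName);

console.log(noConstancy);  // { enter: 3, update: 0, exit: 3 }
console.log(onIndex);      // { enter: 0, update: 3, exit: 0 }
console.log(onName);       // { enter: 1, update: 2, exit: 1 }
```

Only the name key yields a mix of entering, continuing, and exiting items, which is what lets an animation say something about the data.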

jtlan commented 9 years ago

Thanks for this; it looks pretty amazing! Probably the default key function should be "no constancy" to preserve the existing behavior.

Since it looks like you've created a working implementation, would you be willing to send us a pull request? We're happy to review your code, test, and help you navigate the source files.

softwords commented 9 years ago

@jtlan - thanks - I'm happy to be involved (obviously!) but, so far I have only hacked the javascript to get these samples working. Typescript is pretty new to me, so I've been setting up an environment over the weekend and getting familiar with your source, and reading through the history to try to better understand how it evolved to where it is today. Then I'd propose I add one pull request which would include:

implement the key method on Dataset, and make it default to a 'no constancy' key.
remove the 'animateOnNextRender' logic, and verify that with 'no constancy' Bar Plot is animated on each render.

With this, if a genuine key function is used, you will see some meaningful animation occur on data changes.

Thinking more broadly then, to get animations like those in my fiddle, we need to be able to access the enter() and exit() selections after the data() join takes place, so as to be able to work with a 'general update pattern'. In fact, I think a good solution would be to adopt the lifecycle model developed in d3.chart and expose insert, enter, update, exit, and merge. This will involve changes to Drawer, and I can see there has been a lot of to-and-fro in that class over time, so I would need to call on your team for guidance on that.

I'm also looking at (in TypeScript now!) a way to get 'staged' animations; for example, to have a bar chart animate by first sizing each bar to its new height, then moving it to its new x position. There's some motivation for this here. The approach I am working on is a new Animator, Framed, subclassed from Easing. This animator applies the collection of attributes it is passed in a series of transitions, or frames. You specify the relative duration of each frame, and which attribute(s) to transition in that frame. (The last frame will always transition everything.)

So for example when you write


var frames = new Plottable.Animators.Framed()
     .duration(2000)
     .frames([1,3])
     .frameActions([['y', 'height']]);

the 2 seconds are divided into two transitions of 0.5 and 1.5 seconds. In the first transition, only the y and height attributes are transitioned (causing the bar to resize). Everything else is transitioned in the second transition, causing the bar to move to its new x.

This small variation:


var frames = new Plottable.Animators.Framed()
     .duration(2000)
     .frames([10,2,30])
     .frameActions([['y', 'height']]);

introduces a pause between the two movements, because there is no action for the second frame.
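The arithmetic behind the frames can be sketched as follows. This is not the Framed animator itself: `frameSchedule` is a hypothetical helper that only turns the relative weights into start times and durations, using `"*"` as an invented marker for "everything not yet transitioned" in the last frame.

```javascript
// Hypothetical helper: turn relative frame weights into a concrete schedule.
function frameSchedule(totalMs, weights, frameActions) {
  const sum = weights.reduce((a, b) => a + b, 0);
  let start = 0;
  return weights.map((w, i) => {
    const duration = (totalMs * w) / sum;
    const isLast = i === weights.length - 1;
    const frame = {
      start: start,
      duration: duration,
      // Last frame: everything remaining ("*"). Otherwise the listed
      // attributes, or an empty list, which amounts to a pause.
      attrs: isLast ? "*" : frameActions[i] || [],
    };
    start += duration;
    return frame;
  });
}

const twoFrames = frameSchedule(2000, [1, 3], [["y", "height"]]);
console.log(twoFrames[0]);  // { start: 0, duration: 500, attrs: [ 'y', 'height' ] }
console.log(twoFrames[1]);  // { start: 500, duration: 1500, attrs: '*' }

const withPause = frameSchedule(2000, [10, 2, 30], [["y", "height"]]);
console.log(withPause[1].attrs.length);  // 0
```

In the second schedule the weights sum to 42, so the middle frame gets 2/42 of the 2 seconds and, having no attributes listed, simply holds the bars still between the resize and the move.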

One thing I think is nice about this is the symmetry with the array-of-arrays paradigm you have in Table.

Taking this further, by applying Framed transitions to exit(), enter(), and update() as well as to the parent plot (effectively, this is merge in the lifecycle model), you can make quite sophisticated transitions declaratively - e.g. exit points disappear first, then bars resize, then bars move, then entering points grow to fill the vacated points.

jtlan commented 9 years ago

@softwords, we haven't heard from you in a while. Would you like some help with the implementation?

Your proposal seems solid overall, although I have some questions about the animator design. Does the array of frame actions include only attribute names? Also, instead of using a 2-D array of attributes, what if we made each Animator correspond to only a single frame, then used some other structure to combine them? This would let us build on the code in Animators.Easing.

Regarding enter() and exit(): It seems odd to implement this by having a Plot contain other instances of its class (since things like plot.enter().enter().enter() would be valid, although I don't know why you would do this...). However, I can't really think of a cleaner implementation that would also preserve all the property-setting methods on the Plot.

@aicioara, @bluong, @ztsai, can you weigh in?

aicioara commented 9 years ago

I've been watching this thread for a while and was a bit intimidated by the complexity of the solution. Let's maybe try to take one step back and start simplifying it. Let's see what we are trying to build and why.

To be honest, I only understood 75% of what is going on over here and I can't truly weigh in unless I reach 100%.

Hence, let's cut the problem down to the essence and strip it of unnecessary details. What are we trying to achieve, and why? How many users would benefit from this? I can see this information scattered around a 10 minute reading material, which is not really good if we want more people to weigh in.

However, from what I see (and please excuse me if I am wrong), we are trying to achieve a very small feature (one particular animation) by extending our API surface by a huge amount. Also, we are borrowing a lot of ideas from d3, which is not in the spirit of Plottable. Plottable's first goal was to simplify d3 and make plot creation easier. I find .enter() and .exit() particularly odd.

There are two ways to fix the above.

Iterate fast. You open a PR with a POC; we can look at it and better understand what the scale of the change is and whether we want it or not.
Keep talking. Architect it in such a way that it has as few new API points as possible. Leverage what we have and step away from d3.

I hope it helps. I am really looking forward to this discussion.

TL;DR: Let's engineer this issue better so we understand what we are building, why, how long would it take to implement and who would benefit from it.

softwords commented 9 years ago

@jtlan @aicioara; thanks for your comments, sorry for the silence - I've been pulled away from thinking further about this these last 2 weeks. Some motivation: like a lot of people, I found myself bouncing from one d3-based charting package to another as I wanted some particular feature, so when I found a YouTube talk "Why are there so many javascript charting libraries?" it struck a chord, and I was sold on the idea of investing the effort to get proficient with plottable. But I think the major thing missing from plottable right now is support for object constancy; that is, that a screen item (bar, dot, whatever) is linked to some entity in the data, and when the data changes, the animation moves the screen items so that they are still associated with the same entity in the data. This means that the animations you implement can add meaning to the data (watching something move on the screen tells you how the attributes of that entity have changed between the two data sets). Currently, I feel that the animations are really just providing color-and-movement.

The fundamental thing you have to have to get object constancy is a key function supplied to selection.data(); this would want to be attached to the Dataset. The key will determine what goes in the d3 enter() and exit() selections. Without a key, items are just matched by index. My little World Cup fiddle illustrates this: there are 32 data points in each set, but they are not the same 32 items (i.e. the countries change). Without a key function to describe this, there are no exit() or enter() selections. So adding key() to Dataset, and using it in Drawer._bindSelectionData, is the first step.

But then, once you have a way to specify the object constancy, you need some way for the animation to work with that. The key problem here is that you can't get at the d3 enter-update-exit pattern; because, for one thing, the exit() selection is remove()d as soon as the data is bound to the selection. So there is currently no way to make exiting elements fade out, or shrink or whatever to indicate to the viewer they are exiting. Similarly there is no way to make the enter()ing data behave differently to the continuing elements, because the enter() selection is not remembered.

So I think what is needed is: a) some mechanism where plottable can apply transitions to each of the enter()-update()-exit() selections; and b) some simple way to express what you want to do to those.

Finally, if we can get a key, and some way to act on enter(), exit(), and update(), then the last frontier of animation (see my earlier link to the user-experience motivation for this) is to stage the animation. That is, to have items move then resize in separate transitions, and also to have the exit(), enter(), and update() transitions happen in sequence if desired, rather than all at once.

So - that's what I'd see as the 'feature request', while trying to step back (this time!) about how to do it!

@jtlan I agree that the implementation of enter() and exit() (and potentially head-spinning nestings!) was more about expediency than architecture.

softwords commented 9 years ago

Thinking further on this, I think the best way to manage it is within an Animator, and to avoid changing anything in the plots themselves. The interface between Drawer and the Animator would need to change to give the animator access to the enter() and exit() selections created by the data binding.
I.e. instead of passing the selection to animator.animate, Drawer constructs some object { enter: , exit: , update: } and passes this to animator.animate. The animator can act only on update:, which is what currently happens, or can act on the enter and exit selections as well if it wants to. If an animator derives a transition from any of these 3 selections, it must be responsible for replacing the selection with the derived transition in the passed-in object. Then any animator that comes after will act on the transition, and so may build up a chain of transitions.
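A toy model of that contract, under heavy assumptions: `fakeSelection` is an invented stand-in that only records calls (it is not a D3 selection), and the animator objects and `runAnimators` are hypothetical names. The point is only to trace how replacing an entry in the passed-in object lets a later animator chain onto an earlier one's transition.

```javascript
// Invented stand-in for a D3 selection/transition: it only records what was asked of it.
function fakeSelection(name) {
  return {
    name: name,
    steps: [],
    transition() {
      this.steps.push("transition");
      return this;
    },
    attr(k, v) {
      this.steps.push(k + "=" + v);
      return this;
    },
  };
}

// Each hypothetical animator receives { enter, update, exit } and must store
// any transition it derives back into that object.
const fadeOutExit = {
  animate(parts) {
    parts.exit = parts.exit.transition().attr("opacity", 0);
    return parts;
  },
};
const shrinkExit = {
  animate(parts) {
    parts.exit = parts.exit.transition().attr("size", 0);
    return parts;
  },
};
const moveUpdate = {
  animate(parts) {
    parts.update = parts.update.transition().attr("x", "new");
    return parts;
  },
};

// What the Drawer side might do: hand the same object to each animator in turn.
function runAnimators(parts, animators) {
  return animators.reduce((p, a) => a.animate(p), parts);
}

const result = runAnimators(
  { enter: fakeSelection("enter"), update: fakeSelection("update"), exit: fakeSelection("exit") },
  [fadeOutExit, shrinkExit, moveUpdate]
);

console.log(result.exit.steps);    // [ 'transition', 'opacity=0', 'transition', 'size=0' ]
console.log(result.update.steps);  // [ 'transition', 'x=new' ]
console.log(result.enter.steps);   // []
```

An animator that only cares about update can ignore the other two entries, which matches the current behaviour, while the exit chain shows two animators building a sequence.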