APIs to interact with Selection

kanitw commented 7 years ago

This question in the vega-js group is very relevant to this issue.

domoritz commented 7 years ago

@arvind Can you post an example of how one might access a selection? I'm trying to build an application where you can crossfilter between different visualizations. Since I want to re-query the data, I cannot use a single spec for the different charts.

arvind commented 7 years ago

Sure. The selection states are stored in datasets named selectionName_store (e.g., if you had a selection named brush, the dataset would be named brush_store). Accessing the dataset via the view api (view.data('brush_store')) gives you the constituent queries for each of the selection instances (i.e., no resolution will be performed). In the case of point (single/multi) selections, this will be an array of values; for interval selections it will be the data extents. You can similarly set the selection state via the view API provided the tuples you insert follow the same structure. Note: for interval selections, inserting new selection instances via the view API may not always correctly update the brush mark state.
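As a rough illustration of the read path described above, the sketch below wraps view.data in a small helper. The helper name readSelectionStore is hypothetical; only view.data and the selectionName_store naming convention come from the description above, and the exact shape of the returned tuples may differ between versions.

```javascript
// Hypothetical helper: returns the raw, unresolved tuples backing a selection.
// Assumes `view` is a Vega View created from a compiled Vega-Lite spec.
function readSelectionStore(view, selectionName) {
  // Selection state lives in a dataset named `<selectionName>_store`.
  return view.data(selectionName + '_store');
}

// Usage with a real view: const tuples = readSelectionStore(view, 'brush');
```

Because no resolution is performed, the caller still has to interpret the tuples (values for point selections, extents for interval selections).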

domoritz commented 7 years ago

> Note: for interval selections, inserting new selection instances via the view API may not always correctly update the brush mark state.

Why is that? Is there a way to correct the brush?

arvind commented 7 years ago

The brush mark is currently driven by signals within each unit. It's difficult to update these signals based on updates to the backing dataset (we need to identify the matching tuple, and extract information from it). @jheer and I decided that this would be a limitation with 2.0 that we would address in subsequent releases once we better understood how users wanted to update selections via the API.

mstone commented 6 years ago

/cc @djahandarie

arvind commented 6 years ago

To make forward progress on this, I would like to decouple an API for writing to selections (which involves hairier design/implementation issues) from reading selections (which should hopefully be more straightforward). Here're some ideas sketching out the latter.

New API Methods

vl.selection(view, selectionName) -- returns an array of tuples that define a selection's predicate, respecting any resolution rules. For example, [{Origin: 'Japan', Year: 1981}, {Origin: 'USA', Year: 1982}] for a multi selection or [{Horsepower: [40, 150], Miles_per_Gallon: [40, 15]}] for an interval selection.
vl.addSelectionListener(view, selectionName, handler) and vl.removeSelectionListener(view, selectionName, handler).
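To make the tuple format above concrete, here is a minimal sketch of how a consumer might test a datum against the tuples that the proposed vl.selection would return. Nothing here is a released API: the function names are hypothetical, and the matching rules (array values are interval extents in either order, a datum is selected if it matches any tuple, an empty tuple list selects nothing) are assumptions made for illustration.

```javascript
// Hypothetical illustration only: vl.selection is a proposal, not a released API.
// Assumption: a tuple matches when every one of its fields matches. An array value
// is treated as an interval extent (in either order), anything else as an exact value.
function matchesTuple(datum, tuple) {
  return Object.keys(tuple).every((field) => {
    const expected = tuple[field];
    if (Array.isArray(expected)) {
      const lo = Math.min(expected[0], expected[1]);
      const hi = Math.max(expected[0], expected[1]);
      return datum[field] >= lo && datum[field] <= hi;
    }
    return datum[field] === expected;
  });
}

// Assumption: a datum is selected if it matches any tuple (union-style).
// With an empty tuple list this returns false.
function matchesSelection(datum, tuples) {
  return tuples.some((tuple) => matchesTuple(datum, tuple));
}
```

With the multi-selection example above, a datum with Origin 'USA' and Year 1982 would match, while one with Origin 'USA' and Year 1981 would not.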

Notes

Selection tuples are stored in datasets named selectionName_store, and the logic for evaluating these tuples as a predicate is encapsulated within Vega expression functions.
The vlPointDomain and vlIntervalDomain functions resolve the tuples, producing a list of selected values for a specific field. Reading a selection should invoke a more general version of these functions (e.g., vlPointValues and vlIntervalValues) that resolves all selected fields in a single pass, rather than one pass per field.
The simplest solution would be for each selection to add a top-level signal that calls the appropriate Values method. However, if selections are never read from externally, this incurs a performance penalty of re-evaluating selection tuples on every interaction event.
Alternatively, the vl.selection method could invoke these Vega expression functions directly. This strategy would keep selection logic encapsulated within expression functions, and would not incur the cost of needlessly resolving selection tuples on every interaction event. To do so, however, we need the following:
- Parse a Vega expression outside a specification to generate a Function that can be invoked. The vl.selection could memoize this step by storing the generated Function on the view (e.g., view._vlPointValuesAST).
- Selection expression functions register tuplesRef and indataRef on their scopes. Automatically registering these refs when an expression function is parsed externally seems problematic. An alternate solution might instead make it possible to explicitly declare needed refs as part of the specification?
- view.addDataListener and view.removeDataListener functions that the Vega-Lite selection listener functions would map to.
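The last point can be sketched as a pair of thin wrappers. This assumes that view.addDataListener and view.removeDataListener are available on the view and that the selection's tuples live in selectionName_store; the wrapper names mirror the proposal above and are not part of any released API.

```javascript
// Hypothetical Vega-Lite-level helpers that map onto Vega data listeners.
// Assumption: the selection's tuples live in `<selectionName>_store`.
function addSelectionListener(view, selectionName, handler) {
  // The handler receives (datasetName, tuples) whenever the store changes.
  view.addDataListener(selectionName + '_store', handler);
  return view;
}

function removeSelectionListener(view, selectionName, handler) {
  // The same handler reference must be passed to unregister it.
  view.removeDataListener(selectionName + '_store', handler);
  return view;
}
```

Note that a wrapper like this hands the handler unresolved tuples; resolution would still need the expression functions discussed above.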

I lean towards exposing selections as signals, as this is both the simplest and the most idiomatic solution, and it does not require modifications to Vega internals beyond the expression functions. Moreover, these new top-level signals could also offer a cleaner entry point for a future "selection write" API (e.g., writing to these signals would update the backing dataset and any downstream signals within views).

/cc @jheer, @kanitw, @domoritz

domoritz commented 6 years ago

This is great. I think having top level signals makes sense especially if we can use them to write. I wonder whether we even need the helper functions in that case or whether the Vega view API is sufficient.

arvind commented 6 years ago

Yeah, I went back and forth on adding Vega-Lite helper functions. I lean towards adding them (rather than relying on the Vega view API alone) to give users a forward-compatible way to access selections agnostic to the Vega we generate. Thus, we would be free to change the underlying mechanisms of how selections could work in the future.

An interesting question is whether we are protected from all of this with semantic versioning. If we point users to Vega view APIs to access selections, then we're implicitly extending the semantic versioning contract to the Vega we generate. This has advantages (e.g., Lyra would certainly appreciate being able to rely on this definition of semantic versioning, as it analyzes the generated Vega). But I'm not sure how feasible this would actually be, or how we would define what major/minor changes in Vega-Lite -> Vega generation would be...

domoritz commented 6 years ago

I don't know how forward compatible we need to be and I don't see us changing how selections are implemented anytime soon. Thus, I lean towards not providing helper functions.

Every Vega-Lite version already has a minimum Vega version it depends on. We can make a promise about the specific signals while still being flexible about how we generate other parts of the spec.

kanitw commented 6 years ago

I think we should provide helper functions because it's not a realistic expectation that Vega-Lite users should know how we name the underlying data sources and signals.

Plus, there is no "signal" concept in Vega-Lite, so asking users to use signal APIs (which are a lower-level abstraction) is a bit weird.

domoritz commented 6 years ago

Let's see how we name the signals. I'm expecting the signal names to directly correspond to the selection names.

simon-lang commented 5 years ago

Sorry if this isn't the right place for this question, but does this mean if I'm using Vega-Embed and I have a selection in my spec like this:

selection: {
    brush: {
        encodings: ['x'],
        type: 'interval'
    }
}

Then the correct way to access that selection is like this?

view.addDataListener('brush_store', function (name, value) {
    console.log(value[0].intervals[0].extent)
})

This is working for me, but seems a bit verbose? Is there a simpler API for this now?

domoritz commented 5 years ago

@simon-lang We are working on improving this, which we will release in Vega-Lite 3. See https://github.com/vega/vega-lite/pull/4068 for details.

domoritz commented 5 years ago

@arvind we can close this issue once we have documented the new API, right?

arvind commented 5 years ago

Thanks for checking in @simon-lang. As @domoritz mentioned, we should have good news to share on this front soon :)

@domoritz, we're tracking selection API documentation in #2790 so this issue should be safe to close.

simon-lang commented 5 years ago

Thanks @domoritz & @arvind . I just discovered Vega recently and I'm absolutely loving it. Keep up the great work!

kanitw commented 5 years ago

Just got a chance to follow up on this.

The refactor in #4068 is very helpful for interacting with selection data.

However, whether we (1) have a thin wrapper around the signal APIs for the selection APIs or (2) ask users to directly use the signal APIs is still an open question?

I still slightly prefer (1), as it is weird if we provide abstraction only at the syntax level but not at the API level. That said, I'm happy to hear the reasoning for (2) too.

kanitw commented 5 years ago

Another question. The signal refactor definitely makes reading the selection quite straightforward.

However, I remember @arvind mentioned that setting the selection is still tricky. If so, should we start a new issue to discuss the setting part? The trickiest part is probably how to design for faceted data -- but I am starting to think that a combination of group name and key should be sufficient to identify different subplots in the scenegraph?

kanitw commented 5 years ago

Looking closer at selection codebase, currently the unit in the selection data store is generated using the unitName() method.

For faceted plots, which can contain multiple units of the same name, we currently append the unit name with values from row- and column-fields (delimited by _). For example, for a plot that facets by cylinder, a unit name is child_6.

However, this unitName() method has two issues:

1) It only includes the row/column values of the nearest ancestor that is a facet. For nested facets, we can still generate redundant unit names (as the upper facet's value won't be considered). This is currently not very critical, as we still have to deal with other nested facet issues such as https://github.com/vega/vega-lite/issues/2761 and thus still hide nested facets from the official schema.

2) More importantly, setting the selection by writing to the datasets via the view data APIs would be tricky for users. It is hard to provide the right unit name, as we have quite an arbitrary format (row value first, and then the column value). Once we support nested facets, this will get even hairier.

To resolve 2), we could consider splitting key aspects from unit. For example, {unit: "child_6"}, where 6 is the Cylinders value, could become {unit: 'child', key: {Cylinders: 6}}. (Or some similar design.) However, we use the unit as the lookup key for selection (e.g., with resolve: "global"). So splitting this would make the comparison inefficient. Thus, this won't really work.

Instead, we may need to explain this complicated rule in the docs for interacting with selections. However, I think these unit name rules are too complicated. Thus, it is better to provide a selection set/write API that takes input in a more user-friendly format (e.g., {unit/name: 'child', key: {Cylinders: 6}}) and converts it into the internal unit key. If we use this as input, then I think we should use the same format for the output of the read API.
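A minimal sketch of such a conversion, under stated assumptions: the internal name is the unit name followed by the row value and then the column value, joined with underscores, as described above. The function name and the rowField/columnField parameters are invented for this sketch, and the real unitName() method may sanitize values differently.

```javascript
// Hypothetical converter from a user-friendly description to the internal unit key.
// Assumptions: facet values are appended to the unit name with "_", row value first
// and then column value. `rowField` and `columnField` name the facet fields and are
// parameters invented for this sketch (either may be omitted).
function toInternalUnitName(input, rowField, columnField) {
  const parts = [input.unit];
  for (const field of [rowField, columnField]) {
    if (field !== undefined && field in input.key) {
      parts.push(String(input.key[field]));
    }
  }
  return parts.join('_');
}

// Example: a plot faceted by Cylinders only.
const unitName = toInternalUnitName({ unit: 'child', key: { Cylinders: 6 } }, 'Cylinders');
// unitName is "child_6"
```

A read API could apply the inverse mapping so that input and output share one format.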

kanitw commented 5 years ago

Note that I think the format {Cylinders: 6} should be reasonable as nested / crossed facet shouldn't use the same field in multiple row/column channels. Even if users do, many of the cells will be empty and this still won't generate cells with redundant unit names.

kanitw commented 5 years ago

Oh, but the other tricky part is repeat like @arvind originally suspected. Basically, we populate subplot's unit name by appending the repeated variable name to the original unit name. So whatever structure we are providing to support facet above, should support repeat use case as well.

To kick start the conversation, I think we should split repeat from facet as repeat deals with field name while facet deals with field value. Perhaps we could do: {unit: 'child', repeat: {column: "field_name"}, facet: {Cylinders: 6}}. For nested repeat, the key becomes the "as" of the repeater as proposed in https://github.com/vega/vega-lite/issues/2767.

(Btw, having this long thread makes me feel like we should adopt RFCs repo like React. -- It might not be that much more work, but will provide a nice way to iteratively improve proposal.)

dileepyelleti commented 5 years ago

Hi, is this feature available in the latest release? If not, when can we expect it? What I am trying to do is set a selection based on some external event. Please let me know if there is any way to do so.

domoritz commented 5 years ago

Currently, you can read but not write selections.

bearzx commented 5 years ago

Hello, What is the current status of this issue? I tested with @simon-lang's addDataListener approach and that worked. But I wonder if there is a more elegant way than filtering my dataset based on the selection coordinates.

domoritz commented 5 years ago

@bearzx See my comment from two weeks ago. We hope to have a write API for Vega-Lite 4 but no promises.

bearzx commented 5 years ago

I meant what is the API for reading? Do I still need to go through the view API to set a DataListener to read the selection information like @simon-lang mentioned?

bearzx commented 5 years ago

OK, I guess I get it now after re-reading the documentation of the View API. I suppose I still need to know the "internal" variable naming convention: for a one-time read, the best way is view.data("brush_store"), assuming I have a selection named brush.

domoritz commented 5 years ago

Yep, for now you go through the view API. @arvind implemented that the names are consistent with the names of the selections.
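A rough sketch of what reading through the view API could look like when the signal shares the selection's name. It assumes the compiled spec exposes a top-level signal named after the selection (as stated above); the helper name watchSelection is hypothetical, while view.signal, view.addSignalListener, and view.removeSignalListener are Vega view methods.

```javascript
// Hypothetical helper: reads the current selection value once, then keeps
// the handler informed of changes. Returns a function that stops listening.
// Assumption: a top-level signal with the selection's name exists.
function watchSelection(view, selectionName, handler) {
  const listener = (signalName, value) => handler(value);
  handler(view.signal(selectionName)); // one-time read of the current state
  view.addSignalListener(selectionName, listener);
  return () => view.removeSignalListener(selectionName, listener);
}

// Usage with a real view:
// const stop = watchSelection(view, 'brush', (value) => console.log(value));
```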

kanitw commented 5 years ago

FWIW, this discussion/confusion makes it clear that we should try to include this in 4.0 if we have time to do so.

domoritz commented 5 years ago

Well, we have no documentation so I'm not (yet) sure whether we need a second API.

bearzx commented 5 years ago

If it takes too much effort to carefully design the API, I think it's fine for me as a user to have the "hack" for now, since the code is pretty straight-forward (the only catch is the need to know that x_store naming). It would be great if the hack can be clearly documented somewhere so other ppl don't need to mine this thread to find out.

kanitw commented 5 years ago

> Well, we have no documentation so I'm not (yet) sure whether we need a second API.

The problem right now is that we have a leaky abstraction, where the API doesn't match the level of abstraction in the syntax.

I imagine the API for Vega-Lite will be a very thin wrapper, but at least present things at the same abstraction level as in Vega-Lite.

chris-canipe commented 5 years ago

I need the ability to write selections as well (in order to clear them).

I have a multitude of graphs that crossfilter each other and I'm not using Vega Lite's filter/transform because there's a massive amount of data and I have a zippy API that handles the filtering and aggregation.

When a user makes a selection, I intercept it via addSignalListener and add it to a component that manages selections outside of Vega Lite. This is done to centralize selections for the API calls, but more importantly to help the user: there are a multitude of graphs and one can quickly lose track of what they've selected. Therefore, I show all of the selections in one location and give the user the ability to remove them.

And this removal is the issue. If the user interacts with a graph to clear a selection, everything works as expected because it follows the aforementioned flow (Vega Lite -> External). However, if the user removes a selection via the external component, I have no way of telling a graph to clear its selection (External -> Vega Lite).

fredhohman commented 5 years ago

I have a UI that contains multiple Vega-Lite charts, each with its own selections. It would be great to have a button that clears all selection states.

@kanitw suggested current workarounds are (1) set the selection store data to be empty (need to know about Vega _store, etc.) or (2) re-render the plot.
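Workaround (1) might look roughly like the sketch below. It is a hack rather than an official API: it assumes the store dataset is named selectionName_store, that the Vega version in use supports setting values through view.data(name, values), and, for interval selections, the brush mark may stay visible as noted earlier in this thread. The helper names are hypothetical.

```javascript
// Workaround sketch, not an official API.
// Assumption: emptying `<selectionName>_store` clears the selection's predicate.
function clearSelection(view, selectionName) {
  // Replacing the dataset only takes effect once the dataflow runs.
  return view.data(selectionName + '_store', []).runAsync();
}

// Clears the same-named selection in several charts, e.g. from a "clear all" button.
function clearSelections(views, selectionName) {
  return Promise.all(views.map((view) => clearSelection(view, selectionName)));
}
```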

Just wanted to add to the conversation that "clear selection" would be useful to have in a selection API 🙂

slopedog commented 4 years ago

This is hacky, but I'm setting my legend-bound selection by using SVG as my renderer, doing an XPath search of the text element in the legend, and clicking it:

function namespaceResolver(prefix) {
    if (prefix === 'svg') {
        return 'http://www.w3.org/2000/svg';
    }
    return 'http://www.w3.org/1999/xhtml';
}

let textValue = "findMe";
let el = document.getElementById("vega-chart");
let xpath = document.evaluate(`//svg:text[.='${textValue}']`,
                              el, namespaceResolver);
let legendEl = xpath.iterateNext();
let event = document.createEvent('Events');
event.initEvent('click', true, false);
legendEl.dispatchEvent(event);

tgwhite commented 4 years ago

Just adding my two cents.

Vega-lite almost has full functionality in terms of interactivity. Cleaning up the selections API would fix that.

For instance, one might not want to use shift + click or alt + click to generate multi selections. A click is more natural in most cases. By exposing selection getters / setters, one could listen for a click and then add or remove additional selections using this API, without using a shift/alt keypress.

Further, people would be able to programmatically trigger selections without direct user interaction. This would allow a chart to live in a stateful environment, where a user can come back to a page and some default selection is fired in response to something else. Really, the possibilities are endless!

People keep mentioning hacks that are available but I'm new to the Vega library. It would be nice if someone could post some examples that show how to hack the selection state until an API is available.

From what little I know right now, there seem to be potential hacks for mimicking selection state by "manually" altering view aesthetics, or using the view.insert()/view.remove() methods to alter data in response to some event listener.

keckelt commented 4 years ago

Hack & ~Slay~ Select:

Toggle Selections

For the first thing you mentioned in your post: To toggle selection with clicks instead of shift/ctrl click, you can set the toggle to true: Open the Chart in the Vega Editor
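For readers without access to the linked chart, the relevant spec fragment might look like the sketch below. It uses the Vega-Lite 4 selection syntax that appears elsewhere in this thread; the selection name pick is arbitrary, and to my understanding the value is the Vega expression string "true" rather than a boolean, so please check it against the documentation for your version.

```javascript
// Vega-Lite spec fragment (Vega-Lite 4 selection syntax).
// Assumption: the selection name "pick" is arbitrary.
const selection = {
  pick: {
    type: 'multi',
    // The string "true" is a Vega expression: every click toggles membership,
    // so no shift key is needed to add or remove points.
    toggle: 'true'
  }
};
```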

Set an interval

To set a brush/interval, you fire a signal with the interval in screen coordinates (actually coordinates inside the view). I figured that out in the signal recorder:

Open the Chart in the Vega Editor

So if you want to set the brush programmatically, first get the axis' scale via the view API and use it to get from the data domain to the screen domain. You can find an example in this Codepen 💻👉📉

This approach also works for 2D brushes, e.g., in a scatter plot.
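In case the Codepen is unavailable, the idea can be sketched as follows. The assumptions are that the selection's screen-space signal is named selectionName_x (the naming that appears in later comments in this thread) and that the scale is named x; the helper name setBrushX is hypothetical. A 2D brush would presumably set the matching _y signal with the y scale in the same way.

```javascript
// Sketch of setting a 1D brush from data-domain values.
// Assumptions: the signal is named `<selectionName>_x` and the scale is named "x".
function setBrushX(view, selectionName, domainExtent) {
  const xScale = view.scale('x'); // maps data values to pixel positions in the view
  const pixelExtent = domainExtent.map((value) => xScale(value));
  // Setting a signal only takes effect once the dataflow runs.
  return view.signal(selectionName + '_x', pixelExtent).runAsync();
}

// Usage with a real view: setBrushX(view, 'brush', [50, 150]);
```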

Set a clicky selection

I haven't tried since I started using Vega-Lite and might be more successful now, but back then I didn't manage. To my knowledge, you have to know the Vega-internal ids that were assigned to the data to be able to fire a signal containing "select id xy".

For simple bar charts without aggregation, I assume it would be rather straightforward. With data aggregation coming into play, it gets more challenging to find the id in the first place.

Again the signal recorder may help (here clicking on japan):

Open the Chart in the Vega Editor

mdashti commented 3 years ago

Here's a hack using the current version of VegaLite (vega@5.17.0, vega-lite@4.17.0, vega-embed@6.12.2), assuming that you have a brush selection in vegalite_spec_for_minimap:

vegaEmbed(`#minimap`, vegalite_spec_for_minimap)
    .then(function(minimapVe){
        minimapVe.view.addSignalListener('brush', function(signalName, e) {
            console.log("updated");
        });
    });

domoritz commented 3 years ago

No, you should use the Vega view API to listen for signal changes instead. https://vega.github.io/vega/docs/api/view/#signals

mdashti commented 3 years ago

Thanks @domoritz. That's really helpful. Updated my above code.

domoritz commented 3 years ago

Nice. That looks right. If you can use async/await, the code even gets a bit more readable.
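An async/await variant might look like the sketch below. To keep it free of imports, the vegaEmbed function is passed in as the embed parameter; the helper name listenToBrush is hypothetical, and the spec is assumed to define a selection named brush.

```javascript
// Assumptions: `embed` is the vegaEmbed function and the spec defines a
// selection named "brush" that is exposed as a signal of the same name.
async function listenToBrush(embed, selector, spec, onChange) {
  const result = await embed(selector, spec);
  result.view.addSignalListener('brush', (signalName, value) => onChange(value));
  return result.view;
}

// Usage in a page that has loaded vega-embed:
// const view = await listenToBrush(vegaEmbed, '#minimap', vegalite_spec_for_minimap,
//   (value) => console.log('updated', value));
```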

aantn commented 2 years ago

Is it possible to write to the selection from Javascript right now? I can always update all my data points and modify one of their fields, but I was hoping for a cleaner way to do this with the selection API?

domoritz commented 2 years ago

Yes, you can set the selection with the signal and data API. However, you have to reverse engineer the right format.

john-guerra commented 2 years ago

After much research I made this example of how to change the vega-lite brush programmatically

https://observablehq.com/@john-guerra/update-vega-lite-brush-programmatically

Using @koaning's example from this Stack Overflow question, I figured out that you can change the brush by updating "brush_y" (assuming that your selection is called brush) or change the selection using "brush_tuple" (which doesn't seem to update the brush mark).

Am I doing this right? cc @domoritz @arvind @kanitw

marr commented 1 year ago

I am running into this as well. @john-guerra I forked your notebook here https://observablehq.com/@dmarr/update-vega-lite-brush-programmatically with a fix. Seems the way you were loading the vega-dataset no longer works on observable.

My issue is that I have an external control (vue) that I need to bind with the brush value. So I can listen to brush signals thusly:

view.addSignalListener(
  'brush_yearmonthdate_timestamp',
  debounce((_, range) => {
    emit('change', range);
  }, 10)
);

and outside of the vega component I can respond to the change to update my vue control.

However, if I want to set the brush due to user changing the vue input, I am having trouble:

// start and end are external values from a date range selector
const range = [start, end].map(time => view.scale('x')(time));
const currentRange = view.signal('brush_x');
if (range[0] !== currentRange[0] || range[1] !== currentRange[1]) {
  view.signal('brush_x', range).runAsync();
}

That runs when my vega component receives new prop values (from external change). So basically the component gets a new prop on signal change and re-runs its internal listener. Seems like I should be able to change the brush value without firing the signal listener, but maybe I'm missing another option.

Any help would be appreciated. Thanks!

drlynb commented 1 year ago

So after reading all of this, I still cannot manage to get the array of selected objects using the Vega-Lite API. I very simply want to aggregate a set of selected objects from a multi-click event and test for inclusion in this set as a coloring condition. Because I am working at different levels of detail, I cannot get the built-in toggle functionality to work properly. And if I cannot access the array directly, I have no way of fixing that behaviour. In 2017 @arvind suggested we could access these data using view.data('selectionName_store'). But this simply returns an empty set for me. As I am teaching many students who are NOT deep divers into the underlying code, I am hoping for some straightforward approaches that allow querying of the data structure. My problem code can be found in the Observable notebook: https://observablehq.com/d/2f6b14fd93d3f429

My apologies if I am making stupid errors but this is very frustrating.
