Revise whether to blend selection into parameter

kanitw commented 3 years ago

As I start working on documentation, I start questioning whether this is a good idea.

At the conceptual level, selection seems to be a distinct concept from parameter as there are several commands that only work with selections:

"transform": [
      {"filter": {"selection": "brush"}}
    ]

"scale": {"domain": {"selection": "brush", "encoding": "x"}}

"selection": {"or": ["alex", {"not": "morgan"}]}

For this reason, it might make sense to keep selection as a separate thing, but mention that declaring a selection will also produce a parameter that can be referred to.

(We should still consider consolidating point selection and changing selection from key-value object to a flat object with names so it's consistent with parameters.)

Also, we should consider whether we want selections: or selection: as the key. (Right now we use params:, but if we want to keep selection:, then we should rename params to a singular parameter:.)

domoritz commented 3 years ago

There still is a lot of overlap between primitive params and selections (e.g. they can both be bound to inputs). A user might want to start with a simple param and then promote it to a selection. I think if we blend selections and params, we make it much easier for a user to make this transition (the API becomes more fluid and it's easier to adapt a chart with small, atomic changes).

arvind commented 3 years ago

Copying my comments over from Slack:

If I remember correctly, this point was one of the first discussions we'd had when we began this effort in late Spring/early Summer. I think we ended up going with unification because, although parameters and selections have slightly different syntaxes, they have very similar semantics. In particular, while people are now pretty comfortable with using GoG, I wouldn’t say many folks have good conceptual models of how interaction techniques should be constructed outside of event handlers.

As a result, although parameters and selections look pretty different to us, I worry that if we were to separate them, it would introduce hurdles for adoption: two seemingly syntactically similar constructs, both of which produce seemingly similar interactive graphics, and differences only become evident on closer examination. (Analyzing with the Cognitive Dimensions of Notations, separating parameters from selections would introduce a "premature commitment" with a poor consistency/closeness of mapping.)

Instead, by unifying them, we smoothen the abstraction/complexity gradient: users can start with a vanilla parameter and, if it doesn't produce the interaction they want, can accretively "promote" it to a selection. If they're authoring the Vega-Lite spec in an environment that supports JSON schema, the unified approach is also more discoverable/low viscosity. (The unified approach also helps deal with overlaps between the two, e.g., selections bound to input widgets.)
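A rough sketch of what that promotion path could look like, as two spec fragments. The key names (`value`, `bind`, `select`), the parameter name `cyl`, and the field `Cylinders` are assumptions for illustration only; nothing here is settled by this thread.

```json
// Stage 1: a vanilla parameter bound to a slider (assumed syntax).
"params": [
  {
    "name": "cyl",
    "value": 4,
    "bind": {"input": "range", "min": 3, "max": 8, "step": 1}
  }
]

// Stage 2: the same entry "promoted" to a selection by swapping the scalar
// value for a (hypothetical) "select" definition. Name and binding are unchanged.
"params": [
  {
    "name": "cyl",
    "select": {"type": "point", "fields": ["Cylinders"]},
    "bind": {"input": "range", "min": 3, "max": 8, "step": 1}
  }
]
```

The point of the sketch is that the edit between the two stages is small and local: the entry keeps its name, so anything that already refers to `cyl` does not have to be rewritten.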

I think we were always unhappy about keeping "selection" around as a keyword alongside "parameter", and were aware that it isn't ideal. But, from what I remember, we did so out of time constraints. So the fact that this is coming up again suggests to me that we should see if we can resolve this tension before the final v5 release.

One idea I have is that it's not clear to me that the above three examples don't make sense with vanilla parameters. In particular, modulo Vega's support, I think it should be possible to drive a scale domain using a radio box or multi-select parameter (the "encoding" property is optional even for selections). For the other two, I think invoking a vanilla parameter simply by name should treat it as a boolean predicate (truthy/falsy based on its value), analogous to JS' native type coercion. And, for this latter point, I think it exhibits some of the same accretive properties: "parameter": "foo" --> "test": "foo > 5"
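To illustrate the accretive step at the end of that paragraph, here is a minimal sketch as two encoding fragments. The `"parameter"` key is taken from the example above and is hypothetical; the color values and the threshold `5` are placeholders.

```json
// Step 1: refer to the parameter by name; its value would be coerced
// to a boolean (truthy/falsy), as proposed above.
"color": {
  "condition": {"parameter": "foo", "value": "firebrick"},
  "value": "lightgray"
}

// Step 2: once plain truthiness is not enough, replace the reference
// with an explicit predicate over the same parameter.
"color": {
  "condition": {"test": "foo > 5", "value": "firebrick"},
  "value": "lightgray"
}
```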

kanitw commented 3 years ago

If I remember correctly, this point was one of the first discussions we'd had when we began this effort in late Spring/early Summer. I think we ended up going with unification because, although parameters and selections have slightly different syntaxes, they have very similar semantics. In particular, while people are now pretty comfortable with using GoG, I wouldn’t say many folks have good conceptual models of how interaction techniques should be constructed outside of event handlers.

As a result, although parameters and selections look pretty different to us, I worry that if we were to separate them, it would introduce hurdles for adoption: two seemingly syntactically similar constructs, both of which produce seemingly similar interactive graphics, and differences only become evident on closer examination. (Analyzing with the Cognitive Dimensions of Notations, separating parameters from selections would introduce a "premature commitment" with a poor consistency/closeness of mapping.)

At first, I bought this argument, which is why we have the pending PR.

However, as I try to write the docs for the current state of the API (in the pending branch), it doesn't seem like we are really unifying it.

We basically still have selection as an abstraction. But instead, we have demoted it to be a second class construct that can be constructed only via a parameter.

Thinking more about their similarity and differences:

A selection mainly deals with data tuples/rows. Even when there are projections, it still has associated projected fields. A parameter is, for the most part, a scalar value. Given this difference, I don't think we can really rebrand the "selection" keyword so that it is similarly sensible for parameters in the following use cases (see comments):

// To filter by a parameter, one needs to specify (1) which column in the datum
// and (2) which operator to compare to the parameter (e.g., "datum.x > my_param").
// In contrast, a selection already has an inherent notion of containment and thus can be sensibly used like this.

"transform": [
  {"filter": {"selection": "brush"}}
]

// Here selection is a collection of data tuples/rows.  
// Thus we can apply projection and derive domain from the sets.  
// One wouldn't plan to use vanilla parameter in this fashion. 
"scale": {"domain": {"selection": "brush", "encoding": "x"}}

// Again, here we are applying the "containment" notion for selection, which doesn't make sense for parameter. 

"selection": {"or": ["alex", {"not": "morgan"}]}
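To spell out the contrast in the first comment above, here is a minimal sketch of the two filter forms side by side. The parameter definition (`name`/`value` under `params`) and the value `5` are assumptions for illustration; `my_param` and `datum.x` come from the comment itself.

```json
// Scalar parameter: the column (datum.x) and the operator (>) have to be
// written out explicitly in the filter expression.
"params": [{"name": "my_param", "value": 5}],
"transform": [
  {"filter": "datum.x > my_param"}
]

// Selection: containment is implied, so naming the selection is enough.
"transform": [
  {"filter": {"selection": "brush"}}
]
```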

Looking at the code paths in the codebase also suggests that they are two distinct concepts that don't share anything (besides our attempt to put them both under parameter).
As we don't support event handling in (vanilla) parameters, we don't really run into the problem of "two seemingly syntactically similar constructs, both of which produce seemingly similar interactive graphics", because only selections can handle events. I think it's quite easy to explain to people that parameters are simple constants or variables bound to input elements, and that selections are a way to interact with the marks. Only a selection can capture interactivity on the chart. (A parameter can be interactive with binding, but that's only interacting with input elements outside the chart, not the chart itself.)
The only part shared between parameter and selection is that both support binding (and thus can be related to interactivity). Also note the term "can be" in the previous sentence: a parameter can be used as just a constant that is shared across the spec (without any interactivity at all).

It's also unclear to me if the following "accretive" flow would be a common thing esp. considering that selection is mainly dealing with tuples or a projection of tuples while parameter would be mainly a scalar thing:

start with a vanilla parameter and, if it doesn't produce the interaction they want, can accretively "promote" it to a selection.

If they're authoring the Vega-Lite spec in an environment that supports JSON schema, the unified approach is also more discoverable/low viscosity. (The unified approach also helps deal with overlaps between the two, e.g., selections bound to input widgets.)

I think one could argue that by making selection a second-class citizen (hidden inside parameter), it becomes, in a way, less discoverable.

I think it's worth thinking about how we should write documentation to explain these concepts. Basically, I haven't found a convincing way to explain them as a unified thing, and that prompted me to rethink these differences.

domoritz commented 3 years ago

I think there is value in merging selections and params. Besides what @arvind wrote above, I think there is enough commonality (especially to a user) that the two should not be separate. For example, both are implemented with signals and both are (or will be) readable/writable from the outside. At some point Vega will support selections, and then we might have to adjust things here in Vega-Lite as well, but until then I think we should move forward with what we have.

Side note: to me, the selection in the code below is just a shortcut for an expression that correctly handles the fact that a selection can be more than a single value.

"transform": [
      {"filter": {"selection": "brush"}}
    ]

kanitw commented 3 years ago

We have now found a way to cleanly unify them (by making params work in the places where selections used to work, with boolean coercion).
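As a sketch of what that could look like for the three examples from the top of this issue: the `"param"` key and the predicate form of the composition are assumptions here, not something spelled out in this thread.

```json
// Assumed unified syntax: a "param" reference stands where "selection" used to.
"transform": [
  {"filter": {"param": "brush"}}
]

"scale": {"domain": {"param": "brush", "encoding": "x"}}

// Composition written as a logical predicate over param references (assumed form).
{"or": [{"param": "alex"}, {"not": {"param": "morgan"}}]}
```

Under this reading, the same reference would work whether the param is a selection (containment test) or a vanilla param (value coerced to a boolean).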

vega / vega-lite

Revise whether to blend selection into parameter #7149