Open cpsievert opened 7 years ago
unique? Hopefully you're not using keys that aren't unique? :)
Do you need this because it's tricky to specify the key as a group, and that's a thing you want to do? Or is it that you want a group default and are reaching for key because it's there? I'm less sympathetic to the latter than the former--I don't think it's unreasonable to ask you to specify how you want these things to be grouped.
> Hopefully you're not using keys that aren't unique?
There are certainly useful examples with non-unique keys -- https://gist.github.com/cpsievert/b873740854cbfcc7a6c7ce891447c3fb
I want it by default because that's what I would want to do 80% of the time. And yes, it's not immediately obvious that group should be a function of sharedData$key().
That non-unique key example makes no sense to me. We use the key to uniquely identify rows. How does linked brushing work if you can't uniquely identify rows?? I think we have a fundamentally different understanding of what a key is, and in my understanding, not only do they need to be unique but it's far less than 80% of the time that you'd want them to be the filter group. That's analogous to saying you usually want a UI widget to filter a database table by a primary key.
I'm getting really buggy behavior from that example so maybe I have something installed wrong (I'll leave a comment on the gist).
But what I am able to see is that making a lasso selection on the plot on the right highlights some of the values on the left. If I don't have anything selected in the filter_select, then it's clearly not the right points (or right number of points) being selected. However, once I've changed SharedData$new(m, ~variable) to SharedData$new(m), then it selects the right points.
Sorry, I made that example to address this question, not to demonstrate linked brushing (please see my response in the gist)
> We use the key to uniquely identify rows. How does linked brushing work if you can't uniquely identify rows?
I think when you say linked brushing, you imply 1-to-1 linking in a scatterplot matrix. I think of it much more generally. Here is another example (of 1-to-n linking):
library(plotly)
library(crosstalk)
# Key rows by city, so every row of a city shares one key (1-to-n linking)
tx <- SharedData$new(txhousing, ~city)
# One line per city
p1 <- ggplot(tx, aes(date, median, group = city)) + geom_line()
# Density histogram of median price over all rows
p2 <- plot_ly(tx, x = ~median, color = I("black")) %>%
  add_histogram(histnorm = "probability density")
subplot(p1, p2) %>%
  layout(barmode = "overlay") %>%
  highlight(
    "plotly_click", dynamic = TRUE, persistent = TRUE,
    selected = attrs_selected(opacity = 0.3)
  )
This is the sort of example where I'd want filter_select() to inherit its group definition from sharedData$key().
> That's analogous to saying you usually want a UI widget to filter a database table by a primary key.
Exactly. This is how many of my mentors would think of a linked brushing framework. From Cook et al. (1991):
I think we are talking past each other because of a terminology problem. What does "key", and/or "primary key", mean to you? You seem to have defined it as "default grouping" and I'm defining it as "individual row locator". Is that accurate? If so, I'm not arguing right now about whether the concept of a default group is useful; but I am arguing that the property that I intended as an individual row locator, is most certainly not the right place to express that.
And to be clear, I'm also not arguing that 1-to-1 linking is the only useful form of linked brushing. However, what is true today is that Crosstalk is designed for 1-to-1 linking. That being said, I spent a lot of time during the development of Crosstalk thinking about how to support the more general n-to-n linking, and we should probably spend some time talking about it. To start with, I strongly believe you still fundamentally need unique keys; you just need to have potentially multiple keys assigned to each visual object, and depending on the visualization, you might also need to know how to compute partial selection (e.g. your visual object represents keys [A, B, C], and only [A, C] are selected).
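The partial-selection idea can be sketched in base R. This is only an illustration under assumed names (`objects`, `selected`, and `partial` are hypothetical and not part of Crosstalk): each visual object carries the set of unique row keys it represents, and the selected share of each object is computed from the current selection.

```r
# Hypothetical sketch, not Crosstalk's implementation.
# Each visual object (e.g. a bar or a polygon) maps to several unique row keys.
objects <- list(
  bar1 = c("A", "B", "C"),
  bar2 = c("D", "E")
)

# Keys currently selected somewhere in the linked views
selected <- c("A", "C", "E")

# Fraction of each object's keys that are selected (0 = none, 1 = all)
partial <- vapply(objects, function(keys) mean(keys %in% selected), numeric(1))
print(partial)
```

Here `bar1` is two-thirds selected and `bar2` is half selected; how a visualization renders that fraction is left open.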
Let's call "key" the "unit of interaction" -- defined when you initialize a SharedData object (i.e., the key argument). I had always taken the perspective that the definition does not have to be a unique key, and now I'm surprised you don't provide a check for that -- please don't decide to throw an error in that case!
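For reference, a uniqueness check that warns instead of erroring might look like the base R sketch below. The helper `check_keys()` is hypothetical and does not exist in Crosstalk; it only shows that duplicated keys can be reported without rejecting them.

```r
# Hypothetical helper, not part of Crosstalk: report duplicated keys
# with a warning and return them invisibly, without stopping.
check_keys <- function(key) {
  dup <- unique(key[duplicated(key)])
  if (length(dup) > 0) {
    warning("Non-unique keys: ", paste(dup, collapse = ", "), call. = FALSE)
  }
  invisible(dup)
}

# Keys as they would look with a ~city style definition
dups <- suppressWarnings(check_keys(c("Austin", "Austin", "Dallas")))
print(dups)
```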
> I strongly believe you still fundamentally need unique keys
Why? How would that extend to a situation where you have x/y data for a polygon representing one "unit of interaction"?
> you just need to have potentially multiple keys assigned to each visual object
I already have a notion of this built on top of non-unique keys. Please see this slide deck: http://cpsievert.github.io/talks/20161212b/#20
unique(sharedData$key()) seems like a reasonable thing to assume, if not provided.
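A minimal base R sketch of that default, assuming the key is allowed to repeat across rows. The function `options_for()` is hypothetical and is not how filter_select() is implemented; it only shows what falling back to the unique key values would produce as options.

```r
# Hypothetical sketch, not Crosstalk's implementation.
# key:   one value per row (may repeat, as with ~city)
# group: one value per row used to build the select options;
#        falls back to the key itself when not provided
options_for <- function(key, group = key) {
  lvls <- sort(unique(group))
  # For each option, the keys of the rows it would keep
  lapply(setNames(lvls, lvls), function(l) unique(key[group == l]))
}

key <- c("Austin", "Austin", "Dallas", "Houston", "Houston")
opts <- options_for(key)
print(names(opts))
```

With no group supplied, each option is one distinct key value and selects exactly the rows that share that key.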