holoviz / holoviews

With Holoviews, your data visualizes itself.
https://holoviews.org
BSD 3-Clause "New" or "Revised" License

Spatial indexing for linked selections #4596

Open jonmmease opened 4 years ago

jonmmease commented 4 years ago

When performing linked selections on large tabular datasets, it would sometimes be useful to maintain spatial indices for subsets of columns in the dataset.

These indices would make it possible to accelerate multi-dimensional range selections and 2-dimensional geometric selections.
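To make the kind of acceleration concrete, here is a minimal sketch of a uniform-grid index answering a 2-dimensional range selection. It is not HoloViews code: the `GridIndex` class, its `range_query` method, and the `cell_size` parameter are hypothetical names chosen for illustration, and a real implementation might well use an R-tree or k-d tree instead.

```python
from collections import defaultdict
from math import floor


class GridIndex:
    """Uniform-grid spatial index over 2D points (illustrative only)."""

    def __init__(self, xs, ys, cell_size):
        self.xs, self.ys, self.cell_size = list(xs), list(ys), cell_size
        # Map each grid cell to the row numbers of the points that fall in it.
        self.cells = defaultdict(list)
        for row, (x, y) in enumerate(zip(self.xs, self.ys)):
            self.cells[self._cell(x, y)].append(row)

    def _cell(self, x, y):
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def range_query(self, x0, x1, y0, y1):
        """Return sorted row numbers with x0 <= x <= x1 and y0 <= y <= y1."""
        cx0, cy0 = self._cell(x0, y0)
        cx1, cy1 = self._cell(x1, y1)
        hits = []
        # Visit only the cells overlapping the query box, then test exactly.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for row in self.cells.get((cx, cy), ()):
                    if x0 <= self.xs[row] <= x1 and y0 <= self.ys[row] <= y1:
                        hits.append(row)
        return sorted(hits)


xs = [0.5, 1.5, 2.5, 3.5, 1.2]
ys = [0.5, 1.5, 2.5, 3.5, 2.8]
index = GridIndex(xs, ys, cell_size=1.0)
selected = index.range_query(1.0, 3.0, 1.0, 3.0)
print(selected)  # [1, 2, 4]
```

The query touches only the cells that overlap the selection box rather than scanning every row, which is where the speed-up on large tables would come from.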

I'm picturing that we could create these indices on demand when selections are requested. What I'm trying to think through is where we could store or cache them. They would be associated with a specific DataFrame, so it would be nice to somehow store them alongside the DataFrame, or in the Dataset. We might also want to extend the tabular data interface(s) to include spatial selection operations.

philippjfr commented 4 years ago

> They would be associated with a specific DataFrame, so it would be nice to somehow store them alongside the DataFrame, or in the Dataset.

We have long kept the Interface types stateless but this kind of data would imo be a solid reason to allow making Interface instances which hold such data.

jonmmease commented 4 years ago

> We have long kept the Interface types stateless but this kind of data would imo be a solid reason to allow making Interface instances which hold such data.

Yeah, that would be handy. I also wonder if we'd want to consider moving the cache introduced in https://github.com/holoviz/holoviews/pull/4547 into these Interface instances.

jlstevens commented 4 years ago

The Interface types are supposed to be stateless to make it easier to convert and cast between them. This isn't to say that stateful interfaces can't be supported, but I would prefer not to make them available by default, i.e., you could use them by specifying the datatype explicitly. Alternatively, you could warn when converting in a lossy fashion, but that might just be more confusing.

philippjfr commented 4 years ago

If the state is explicitly restricted to extraneous metadata or optimization-related data, as a spatial index would be, I wouldn't worry very much about that part being lossy.
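A rough sketch of why losing optimization-only state on conversion is harmless. None of this is the actual HoloViews Interface API: `ListInterface`, `IndexedInterface`, `select_range`, and `convert` are hypothetical names. The stateful variant caches a sorted order per column; conversion copies only the rows, and the selection results stay the same.

```python
from bisect import bisect_left, bisect_right


class ListInterface:
    """Hypothetical stateless interface: rows stored as a list of tuples."""

    def __init__(self, rows):
        self.rows = list(rows)

    def select_range(self, column, low, high):
        # Full scan on every call.
        return [r for r in self.rows if low <= r[column] <= high]


class IndexedInterface(ListInterface):
    """Hypothetical stateful variant that caches a sorted order per column."""

    def __init__(self, rows):
        super().__init__(rows)
        self._sorted = {}  # column -> (sorted values, row positions)

    def select_range(self, column, low, high):
        if column not in self._sorted:
            order = sorted(range(len(self.rows)),
                           key=lambda i: self.rows[i][column])
            self._sorted[column] = ([self.rows[i][column] for i in order], order)
        values, order = self._sorted[column]
        lo = bisect_left(values, low)
        hi = bisect_right(values, high)
        # Restore original row order so results match the full scan.
        return [self.rows[i] for i in sorted(order[lo:hi])]


def convert(interface, target_type):
    """Cast between interfaces by copying rows only; cached state is dropped."""
    return target_type(interface.rows)


rows = [(5, "a"), (1, "b"), (3, "c"), (4, "d")]
indexed = IndexedInterface(rows)
fast = indexed.select_range(0, 3, 5)
plain = convert(indexed, ListInterface)
slow = plain.select_range(0, 3, 5)
```

Here `fast` and `slow` hold the same rows. The conversion discards `_sorted`, which costs a rebuild later but never changes the result of a selection.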