flekschas / regl-scatterplot

Scalable WebGL-based scatter plot library built with Regl
https://flekschas.github.io/regl-scatterplot/
MIT License

Feature request: sort values #162

Open oskbor opened 10 months ago

oskbor commented 10 months ago

Hi! Currently regl-scatterplot renders the items in the order they appear in the arrays. This means that outliers (high or low values) can be obscured by other dots on the plot. My request is to have an option to draw the largest (or smallest) value on top.

Our current workaround for this is to sort the arrays before passing them to regl-scatterplot. This works, but it's clunky, and we also need to "unsort" the values in the selection/hover events as the index is the source of truth.
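For readers unfamiliar with this workaround, here is a minimal sketch of what "sort, then unsort" can look like in the wrapper application. The helper names (`argsortAscending`, `reorder`, `toOriginalIndices`) and the sample data are hypothetical and not part of regl-scatterplot; the only assumption is that selection/hover events report indices into the sorted arrays.

```typescript
// Hypothetical helper names; none of this is part of regl-scatterplot's API.

/** Indices of `values` ordered so that the largest value comes last (drawn on top). */
function argsortAscending(values: number[]): number[] {
  return values
    .map((_, index) => index)
    .sort((a, b) => values[a] - values[b]);
}

/** Reorders one array according to `order` (sorted position -> original index). */
function reorder<T>(items: T[], order: number[]): T[] {
  return order.map((originalIndex) => items[originalIndex]);
}

/** Maps indices reported for the sorted arrays back to indices into the original arrays. */
function toOriginalIndices(sortedIndices: number[], order: number[]): number[] {
  return sortedIndices.map((sortedIndex) => order[sortedIndex]);
}

// Example data: the point with value 0.9 should end up last, i.e. on top.
const xs = [0.1, 0.5, -0.3, 0.8];
const ys = [0.2, -0.4, 0.6, 0.0];
const values = [0.3, 0.9, 0.1, 0.5];

const sortOrder = argsortAscending(values); // [2, 0, 3, 1]

// Every per-point array has to be reordered with the same index.
const sortedXs = reorder(xs, sortOrder);
const sortedYs = reorder(ys, sortOrder);
const sortedValues = reorder(values, sortOrder);

// Suppose an event reports index 3 of the sorted arrays (the topmost point).
const selectedOriginal = toOriginalIndices([3], sortOrder); // [1]
```

Note that `sortOrder` doubles as the "unsort" map: position `i` in the sorted arrays corresponds to `sortOrder[i]` in the original ones, so no separate inverse lookup is needed for this direction.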

My hope/idea with this request is that maybe regl-scatterplot can achieve the desired behaviour w/o sorting?

Idea 1 would be in line with most other settings

sortBy?: "valueA" | "valueB"
sortDirection?: "ascending" | "descending"

Idea 2 is to pass an array with the wanted index ordering

sortBy?: number[]

best regards Oskar

flekschas commented 10 months ago

I'm not planning to add support for this as I believe this should be handled by the wrapper application. I.e., just sort the data prior to drawing it with regl-scatterplot. The performance will be the same but you will have all the flexibility in the world to sort the data in any way you like (not just by valueA and valueB).

> We also need to "unsort" the values in the selection/hover events as the index is the source of truth.

Either way there needs to be a sort index. Either you handle it or regl-scatterplot does. I currently don't see a performance/memory difference.

tl;dr: I thought about adding such a feature a long time ago but realized there are too many possible ways to sort data. E.g., what if someone wants to randomize the draw order? Or what if you want both: render low and high values first and average values last? Or, as in the performance mode example, order points by quadrant/bin for better rendering performance? I realized that it's hard to anticipate all use cases upfront and that it is instead easier to let the wrapper application decide. This also allows optimizing the data upfront in a static fashion if desired, which is not possible with any built-in solution. Of course, regl-scatterplot could offer an interface for flexible sorting similar to JS's built-in sort() function. But then regl-scatterplot would just be re-implementing what JS already offers.
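To illustrate the point about flexibility, two of the orderings mentioned above can be sketched in the wrapper application with nothing more than the built-in `sort()`. The helper names (`orderBy`, `extremesFirst`, `randomOrder`) are hypothetical and not part of regl-scatterplot. The convention assumed here is that indices later in the returned array are drawn later, i.e. on top.

```typescript
// Hypothetical helpers for the wrapper application; not part of regl-scatterplot.

/** Indices 0..n-1 sorted ascending by `key(index)`. */
function orderBy(n: number, key: (index: number) => number): number[] {
  const indices: number[] = [];
  for (let i = 0; i < n; i++) indices.push(i);
  return indices.sort((a, b) => key(a) - key(b));
}

/** Low and high values first, values closest to the mean last. */
function extremesFirst(values: number[]): number[] {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  // Negated distance: the points farthest from the mean get the smallest key.
  return orderBy(values.length, (i) => -Math.abs(values[i] - mean));
}

/** Random draw order via a Fisher-Yates shuffle; `random` is injectable for reproducibility. */
function randomOrder(n: number, random: () => number = Math.random): number[] {
  const order = orderBy(n, (i) => i); // identity order 0..n-1
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [order[i], order[j]] = [order[j], order[i]];
  }
  return order;
}

const sampleValues = [10, 50, 52, 90];
const extremesOrder = extremesFirst(sampleValues); // [0, 3, 2, 1]
const shuffledOrder = randomOrder(sampleValues.length);
```

Ordering by quadrant/bin would follow the same pattern, with `key` returning a bin id, which is why a fixed `sortBy`/`sortDirection` option would only cover a small part of what the wrapper application can already do.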

The only feature that might be nice to have is allowing the wrapper application to pass the sorted indices to regl-scatterplot at your own risk. E.g.:

point index | draw order
0           | 3
1           | 2
2           | 1
3           | 0

By "at your own risk" I mean that you could still pass [0, 0, 0, 0] as the draw order, in which case regl-scatterplot would draw only one point, four times.
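If such a draw-order option existed, the wrapper application could guard against this mistake itself. The following is a minimal sketch of a permutation check; `isValidDrawOrder` is a hypothetical name, and regl-scatterplot is not known to expose anything like it.

```typescript
// Hypothetical check; regl-scatterplot does not expose such a function.

/** True if `drawOrder` contains every index from 0 to numPoints - 1 exactly once. */
function isValidDrawOrder(drawOrder: number[], numPoints: number): boolean {
  if (drawOrder.length !== numPoints) return false;
  const seen: boolean[] = [];
  for (const index of drawOrder) {
    // Reject non-integers (including NaN) and out-of-range indices.
    if (Math.floor(index) !== index || index < 0 || index >= numPoints) return false;
    // Reject duplicates: the same point would be drawn more than once.
    if (seen[index]) return false;
    seen[index] = true;
  }
  return true;
}

const reversedIsValid = isValidDrawOrder([3, 2, 1, 0], 4); // true
const repeatedIsValid = isValidDrawOrder([0, 0, 0, 0], 4); // false
```

The check is linear in the number of points, so running it once before handing the index over should be cheap compared to the sort that produced it.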

oskbor commented 6 months ago

> The only feature that might be nice to have is allowing the wrapper application to pass the sorted indices to regl-scatterplot at your own risk.

This would simplify things a lot for us! We then wouldn't have to sort several arrays (x, y, size, color), just one. We also wouldn't have to keep an index map around for the selection events, which would simplify things further 👍