Closed TheStanian closed 5 months ago
This is a great idea and certainly a feature I'd be happy to add. Technically I think there are a few ways to achieve this and they might all be useful to have on their own:
1. … `publicDraw()`. This would be useful on its own when one wants to avoid the main thread having to handle state data at all.
2. … `publicDraw()` (this needs updating of kdbush, which I already did in https://github.com/flekschas/regl-scatterplot/pull/143). This would be useful on its own because for static data one could pre-compute the index.
3. Allow `publicDraw()` to accept partial point data (i.e., only some components) and then have that function skip unnecessary updates. Internally we can, for instance, update the state texture using regl's `buffer.subdata()`. The biggest hurdle is that the draw function relies on row-major data (see `toArrayOrientedPoints()`). Refactoring this would generally be good.

What about the following non-breaking API changes to `scatterplot.draw()`?
```ts
type Draw = (
  data:
    | number[][]
    | Partial<{ x: number[] | Float32Array, y: number[] | Float32Array, ... }>
    | Partial<DrawData>,
  options: Partial<{
    showPointConnectionsOnce: boolean,
    transition: boolean,
    transitionDuration: number,
    transitionEasing: string,
    preventFilterReset: boolean,
    hover: number,
    select: number[],
    filter: number[],
    zDataType: 'continuous' | 'categorical',
    wDataType: 'continuous' | 'categorical',
    // NEW
    kdbushIndex: ArrayBuffer
  }>
) => Promise<void>;

interface DrawData {
  points: Partial<Points> | Float32Array,
  connections: number[][],
}

interface Points {
  x: number[] | Float32Array,
  x: number[] | Float32Array,
  z: number[] | Float32Array,
  w: number[] | Float32Array,
}
```
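To make the partial-update idea from option 3 more concrete, here is a minimal sketch of how provided components could be written into existing state data while everything else is left untouched. It is only an illustration under assumptions: the interleaved `[x, y, z, w]` layout with a stride of 4, and the names `PartialPoints` and `applyPartialPoints`, are hypothetical and not part of regl-scatterplot.

```typescript
// Hypothetical column-oriented input: any subset of components may be present.
type Column = number[] | Float32Array;
type PartialPoints = Partial<Record<'x' | 'y' | 'z' | 'w', Column>>;

// Assumed layout: one [x, y, z, w] tuple per point, stored back to back.
const COMPONENTS = ['x', 'y', 'z', 'w'] as const;
const STRIDE = COMPONENTS.length;

/**
 * Writes only the provided components into the interleaved state array.
 * Returns whether x or y were part of the update, i.e., whether a spatial
 * index built from the old positions may be stale.
 */
function applyPartialPoints(
  state: Float32Array,
  update: PartialPoints
): { positionsTouched: boolean } {
  const numPoints = state.length / STRIDE;
  let positionsTouched = false;

  COMPONENTS.forEach((name, offset) => {
    const column = update[name];
    if (column === undefined) return; // component not provided: skip it

    if (column.length !== numPoints) {
      throw new Error(`Expected ${numPoints} values for "${name}"`);
    }

    for (let i = 0; i < numPoints; i++) {
      state[i * STRIDE + offset] = column[i];
    }

    if (name === 'x' || name === 'y') positionsTouched = true;
  });

  return { positionsTouched };
}
```

With something like this, a call that only passes `z` or `w` never touches the positions, so the caller knows the existing index is still valid.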
In terms of API changes, your `Points` interface has a double declaration of `x`, and could maybe be used in your type definition for `Draw`? Other than that it looks good to me.

I especially like option 3, as I feel leaning into structs of arrays might be worthwhile for further optimisations down the road (e.g. reducing temporary memory allocation by using typed arrays under the hood) while also avoiding having to expose the casual user to internals (state texture, KDBush index) too much.
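As a rough illustration of the struct-of-arrays layout mentioned here, the sketch below converts row-major points (`number[][]`) into one typed array per component. The names `PointColumns` and `toColumns` are hypothetical, and filling missing components with `0` is an assumption made for this example, not documented library behavior.

```typescript
// Hypothetical struct-of-arrays representation: one typed array per component.
interface PointColumns {
  x: Float32Array;
  y: Float32Array;
  z: Float32Array;
  w: Float32Array;
}

/**
 * Converts row-major points ([x, y, z?, w?] per row) into typed-array columns.
 * Missing components default to 0 (an assumption for this sketch).
 */
function toColumns(rows: number[][]): PointColumns {
  const n = rows.length;
  const columns: PointColumns = {
    x: new Float32Array(n),
    y: new Float32Array(n),
    z: new Float32Array(n),
    w: new Float32Array(n),
  };

  rows.forEach((row, i) => {
    columns.x[i] = row[0] ?? 0;
    columns.y[i] = row[1] ?? 0;
    columns.z[i] = row[2] ?? 0;
    columns.w[i] = row[3] ?? 0;
  });

  return columns;
}
```

Each column is allocated once up front, so replacing a single component later (say, `z`) does not require rebuilding per-point arrays.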
@TheStanian It took a bit but I finally managed to support skipping the spatial index computation. See https://github.com/flekschas/regl-scatterplot/pull/178 for an example.

While I haven't had time to implement drawing of partial data, you can now precompute the spatial index (in a worker or main thread) and pass that index in with the draw call: `scatterplot.draw(points, { spatialIndex })`. In the PR I'm showing a test I ran in Firefox where updating the Z/W coordinates for one million points is sped up 6x by skipping the re-indexing.

I'm still planning to add drawing of partial data (e.g., `scatterplot.draw({ z: [...] })`), in which case the spatial indexing can be skipped automatically. But that's for another PR.
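The decision of when re-indexing can be skipped could look roughly like the sketch below. It is not how regl-scatterplot implements it: `createSpatialIndexCache`, `DrawInput`, and the `build` callback are hypothetical names, and `build` simply stands in for whatever constructs the index (for example, a KDBush instance).

```typescript
// Hypothetical partial draw input: every component is optional.
interface DrawInput {
  x?: Float32Array;
  y?: Float32Array;
  z?: Float32Array;
  w?: Float32Array;
}

/**
 * Keeps the last spatial index and only rebuilds it when positions change
 * and no precomputed index is supplied by the caller.
 */
function createSpatialIndexCache<Index>(
  build: (x: Float32Array, y: Float32Array) => Index
) {
  let x: Float32Array | undefined;
  let y: Float32Array | undefined;
  let current: Index | undefined;
  let buildCount = 0;

  return {
    resolve(input: DrawInput, provided?: Index): Index {
      const positionsGiven = input.x !== undefined || input.y !== undefined;
      if (input.x !== undefined) x = input.x;
      if (input.y !== undefined) y = input.y;

      // A precomputed index (e.g., built in a worker) always wins.
      if (provided !== undefined) {
        current = provided;
        return provided;
      }

      // Only z/w changed: the existing index is still valid.
      if (!positionsGiven && current !== undefined) return current;

      if (x === undefined || y === undefined) {
        throw new Error('Cannot build a spatial index without x and y');
      }

      const rebuilt = build(x, y);
      buildCount += 1;
      current = rebuilt;
      return rebuilt;
    },
    getBuildCount: (): number => buildCount,
  };
}
```

In this sketch, a draw that only carries `z` or `w` reuses the cached index, and a draw that passes an index explicitly never triggers a rebuild.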
Right now, the only way of updating the point metadata (valueA and valueB, or whichever alias you prefer) is to do a full draw call including coordinate data. I have a specific application in which only the color indices of the points change. Each such change currently triggers a rebuild of the KDBush index, which in this case (10M+ points) is a major and unnecessary slowdown. I would like to see a way to change only the point metadata, so the rebuild doesn't need to occur. This could, for example, take the form of taking the existing state texture management as used in setPoints and exposing that on its own, using the currently available points. Doing so would not break backwards compatibility with anything whatsoever.
Thank you for your consideration and with kind regards,
Stan