gl-vis / gl-scatter2d

2D scatter plots
https://gl-vis.github.io/gl-scatter2d
MIT License

Notable splatting artifacts #8

Open monfera opened 7 years ago

monfera commented 7 years ago

White dots sometimes appear: image

Code:

/**
 * Test multiple points
 */
require('enable-mobile')
const setup = require('./')
const random = require('gauss-random')

//5e6 is allocation maximum
// var POINT_COUNT = 3e6
var POINT_COUNT = 1e6

var positions = new Float32Array(2 * POINT_COUNT)
for(var i=0; i<2*POINT_COUNT; ++i) {
  positions[i] = random() * 1
}

setup({
  positions:  positions,
  size:      5,
  color:     [0,0,0,0.1],
  borderSize: 0,
  snapPoints: true,
  borderColor: [.5,.5,.5,.5]
})

It happens once every few page reloads, and may also depend on the current screen size or aspect ratio; I'm not sure.

This is the good version: image

monfera commented 7 years ago

Zooming is more informative: splatting

monfera commented 7 years ago

Looks like it's an inherent property of the current splatting algo, observable at a7769d3 or even e94a849 (the latter commit is best viewed with fully opaque points).

monfera commented 7 years ago

Besides the jump in apparent density, this example shows a nebulous rectangular shape in a Gaussian distribution:

splatting

monfera commented 7 years ago

Wondering if, instead of splatting, we'd be better off extending the pointcloud approach with marker shapes and, if needed, per-point styling. Currently pointcloud uses only circle markers, with or without borders, but arbitrary shapes could be added with SDF or any other discard-based approach. Similarly, there could be per-point color or shape; it just needs the appropriate array attributes. image
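To illustrate the discard-based idea, here is a minimal CPU-side sketch of how signed distance functions (SDFs) could define marker shapes. It is not pointcloud's shader code: `sdfCircle`, `sdfSquare`, `sdfDiamond` and `rasterize` are hypothetical names, and the loop only mimics what a fragment shader would do per pixel (keep the fragment where the distance is non-positive, discard it otherwise).

```javascript
// Signed distance functions: negative inside the shape, positive outside.
// Coordinates are relative to the marker center, in units of marker radius.
function sdfCircle(x, y) { return Math.hypot(x, y) - 1 }
function sdfSquare(x, y) { return Math.max(Math.abs(x), Math.abs(y)) - 1 }
function sdfDiamond(x, y) { return Math.abs(x) + Math.abs(y) - 1 }

// Mimic a fragment shader over a gridSize x gridSize point sprite:
// '#' marks a kept fragment, '.' marks a discarded one.
function rasterize(sdf, gridSize) {
  const rows = []
  for (let j = 0; j < gridSize; ++j) {
    let row = ''
    for (let i = 0; i < gridSize; ++i) {
      // Map the pixel center to the [-1, 1] range.
      const x = ((i + 0.5) / gridSize) * 2 - 1
      const y = ((j + 0.5) / gridSize) * 2 - 1
      row += sdf(x, y) <= 0 ? '#' : '.'
    }
    rows.push(row)
  }
  return rows
}

console.log(rasterize(sdfDiamond, 7).join('\n'))
```

A per-point shape would then amount to selecting the SDF from a per-point attribute, which is the "appropriate array attributes" part of the suggestion.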

The pointcloud approach is not sensitive to zoom level, and the automatic marker resizing, needed for performance, is helpful in its own right for tuning dataviz for readability, as lots of overlapping markers are indecipherable. It's also much faster to initialize, and point picking is very simple.

For an analogous example, consider recent research for autotuning the marker opacity channel: image

dy commented 7 years ago

@monfera this side effect seems to come from an extra quadtree built upon the main tree, which is described in https://github.com/perouz/scatter-line-plots/blob/master/plotly.pdf. That should be fixable, but I don't completely understand the source code of snap-points-2d. I observed a similar artifact with kdtrees, though.

I am not sure whether the pointcloud approach should be a feature of the gl-scatter/regl-scatter component. The splatting feature can be turned off with the {snapPoints: false} option, but automatic resizing should likely be done in userland. I would suggest that for this component it is rather important to keep the point size corresponding to the one in the options; it should not be so smart as to make data-based decisions, since that is a separate concern.

monfera commented 7 years ago

I guessed the artifact happens at level-of-detail (LoD) switching and is inherent to the approach, i.e. not something with a straightforward fix. Rasterization does bring in artifacts even with a direct (non-splatting) approach. For example, in pointcloud, I jitter the marker size by adding a minuscule pseudorandom number in the shader code to avoid jumps on zoom, because this way the population of points doesn't all jump from covering 1 pixel to 2 pixels at once. But I think that the LoD switching is fundamental to the splatting approach; it's not done at the point level. If you think it's fixable while keeping the benefits of the approach, it might be worth a shot at some point.
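The size jitter mentioned above can be sketched on the CPU as follows. This is only an illustration of the idea, not the actual pointcloud shader: `hash01`, `jitteredSize`, `countAtTwoPixels` and the jitter amplitude are hypothetical names and values picked for the example, and rounding stands in for rasterization.

```javascript
// Hypothetical sketch: per-point size jitter so that points do not all
// cross a pixel-size threshold at the same zoom level.

// Cheap deterministic pseudorandom value in [0, 1) derived from a point index.
// (Assumed hash; a shader would more likely derive it from the point position.)
function hash01(index) {
  const x = Math.sin(index * 12.9898 + 78.233) * 43758.5453
  return x - Math.floor(x)
}

// Nominal size plus a small per-point offset in [-amplitude/2, amplitude/2).
function jitteredSize(baseSize, index, amplitude) {
  return baseSize + (hash01(index) - 0.5) * amplitude
}

// Count how many points end up 2 pixels wide after rounding.
function countAtTwoPixels(baseSize, pointCount, amplitude) {
  let count = 0
  for (let i = 0; i < pointCount; ++i) {
    if (Math.round(jitteredSize(baseSize, i, amplitude)) === 2) count++
  }
  return count
}

// Without jitter, every point switches from 1 to 2 pixels at the same size.
console.log(countAtTwoPixels(1.45, 1000, 0)) // 0
console.log(countAtTwoPixels(1.55, 1000, 0)) // 1000
// With jitter, only part of the population has switched at a given size.
console.log(countAtTwoPixels(1.5, 1000, 0.5))
```

The jitter spreads the 1-to-2 pixel transition over a range of zoom levels, so the apparent density changes gradually instead of jumping.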

Re my pointcloud related suggestion, we shouldn't act on it prior to discussing it and having approval from @etpinard, I just put forth an option. If we ignore the artifact issue, and the lowered usability when points heavily overlap, then we can at least have something that's a slot-in replacement of the current approach, which is the currently discussed route and also what you summarize above.

Regarding the userland control, indeed it's desirable to separate a dumb view from a more data- and user-experience-aware layer that tweaks the point sizes (if it ever gets the architectural green light). The only technical constraint is that latency should be kept low, i.e. the higher abstraction layer needs to be made aware of the zooming and panning (i.e. in the rAF loop), and point size calculation and uniform modification should happen quickly. Analogy: moving it from pointcloud to convert.js in plotly.js, while also letting convert.js receive callbacks on pan and zoom, and letting it modify just the point size. (Here I assume you mean the dependent, plotly.js, as userland code, as probably few plotly.js users would know of and bother with establishing such a nontrivial feedback loop.)
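A rough sketch of that separation, under stated assumptions: `createView`, `onViewChange`, `setDataBox` and `sizePolicy` are hypothetical names invented here and are not part of gl-scatter2d, pointcloud or plotly.js, and the pixel-budget rule is an arbitrary stand-in for whatever sizing heuristic the upper layer would use.

```javascript
// Hypothetical "dumb view": it only stores the point size it is given.
function createView() {
  const listeners = []
  const view = {
    pointSize: 5,
    // Register a callback that is invoked on every pan/zoom.
    onViewChange(fn) { listeners.push(fn) },
    // Would be called by the interaction code, e.g. inside the rAF loop.
    setDataBox(box, visibleCount) {
      for (const fn of listeners) fn(view, box, visibleCount)
    }
  }
  return view
}

// Hypothetical userland policy: shrink markers when many points are visible.
// The work per call is a few arithmetic operations, so latency stays low.
function sizePolicy(minSize, maxSize, pixelBudget) {
  return function (view, box, visibleCount) {
    const size = Math.sqrt(pixelBudget / Math.max(visibleCount, 1))
    view.pointSize = Math.min(maxSize, Math.max(minSize, size))
  }
}

const view = createView()
view.onViewChange(sizePolicy(1, 5, 250000))

view.setDataBox([-1, -1, 1, 1], 1e6)
console.log(view.pointSize) // 1 (clamped to the minimum)

view.setDataBox([-0.01, -0.01, 0.01, 0.01], 100)
console.log(view.pointSize) // 5 (clamped to the maximum)
```

The view itself makes no data-based decision; only the registered callback does, and it touches nothing but the point size.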

dy commented 7 years ago

Remotely related: there is a supposedly artifact-free approach to clustering points, based on distance sectioning rather than spatial sectioning. The concept is partly implemented by https://github.com/spotify/annoy