GlobalFishingWatch / pelagos-client

Documentation
http://globalfishingwatch.io/pelagos-client/

Webgl rendering should treat weight < 1.0 differently than weight > 1.0 #144

Open pwoods25443 opened 8 years ago

pwoods25443 commented 8 years ago

When the cluster (heatmap) animation displays points with weight < 1.0, they become too faint to see too quickly as the value moves away from 1.0, so that 0.5 is practically invisible. Let's try just setting a minimum of 1.0 in the WebGL rendering. Previously we were doing this at tile generation time.

pwoods25443 commented 8 years ago

OK, so it looks like weight < 1.0 is already treated differently here:

https://github.com/SkyTruth/pelagos-client/blob/devel/js/app/Visualization/Animation/ClusterAnimation-vertex.glsl#L53

I propose that we treat it differently, but in a different way.

Let's just force values < 1.0 to be equal to 1.0 and see how that looks.

We may want to make this more sophisticated by tying this manipulation to the setting of the intensity slider so that as you turn up the intensity we compress values in the range [0.0, 1.0] toward 1.0 such that when the intensity is at maximum, all values [0.0, 1.0] are mapped to the range [1.0, 1.0], and when the intensity is at minimum we leave the values unchanged.
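A minimal sketch of that mapping (a hypothetical helper in Python for clarity, not the actual shader code; in GLSL the same thing would be a `mix()` call):

```python
def compress_weight(w, intensity):
    """Sketch of the proposed mapping: pull weights in [0, 1] linearly
    toward 1.0 as the intensity setting rises; weights >= 1.0 pass
    through unchanged. `intensity` is assumed normalized to [0, 1]."""
    if w >= 1.0:
        return w
    # intensity = 0.0 leaves w unchanged; intensity = 1.0 maps all of
    # [0, 1] onto 1.0, matching the proposal above.
    return w + intensity * (1.0 - w)
```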

enriquetuya commented 8 years ago

@redhog how difficult would it be to implement what Paul suggested?

enriquetuya commented 8 years ago

In the meantime we are trying this approach: https://github.com/SkyTruth/benthos-pipeline/issues/479

redhog commented 8 years ago

Very easy, but it would make things worse for clustering. We need to set the value to 1 in the tileset pipeline before we cluster, so that the intensity of clusters includes this raised intensity, or we would have one more source of the zoom problem we're already having...

enriquetuya commented 8 years ago

We still need to address this. Including Paul's comment after his review of the latest tileset with weight changes: https://github.com/SkyTruth/benthos-pipeline/issues/411#issuecomment-213111981

andres-arana commented 8 years ago

Before we discuss this we all need to know how it works right now, so here we go. These are things @redhog already knows (and if you find any error, let's talk about it here), but @enriquetuya and @geowurster may not be aware of them.

The current state of things

The pipeline

The parameters that have some degree of influence here are the cluster sigma and the weight. The pipeline part is somewhat straightforward. Each displayed record may be a single-record cluster or a multi-record cluster.

For single-record clusters, the sigma is zero, and the weight is calculated by multiplying the fishing score by the number of seconds since the last message of the same track.

For multi-record clusters, the weight is the sum of the weights of the clustered records, and the sigma is the standard deviation in latitude and longitude combined (really, the Pythagorean combination: sqrt(stddev(lat)^2 + stddev(lon)^2)), calculated by keeping running sums of squares and so on.
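The aggregation described above can be sketched like this (a hypothetical Python rendition for illustration; the real pipeline presumably streams the running sums rather than holding all records in memory, and may use a different stddev convention):

```python
import math

def cluster_stats(records):
    """Sketch of the described aggregation, assuming `records` is a
    list of (lat, lon, weight) tuples. Sigma combines the latitude and
    longitude standard deviations via Pythagoras, computed here from
    sums and sums of squares (population stddev, as one assumption)."""
    n = len(records)
    sum_lat = sum(r[0] for r in records)
    sum_lon = sum(r[1] for r in records)
    sum_lat2 = sum(r[0] ** 2 for r in records)
    sum_lon2 = sum(r[1] ** 2 for r in records)
    weight = sum(r[2] for r in records)  # weights simply sum
    # E[x^2] - E[x]^2, clamped to avoid tiny negative rounding errors
    var_lat = max(sum_lat2 / n - (sum_lat / n) ** 2, 0.0)
    var_lon = max(sum_lon2 / n - (sum_lon / n) ** 2, 0.0)
    sigma = math.sqrt(var_lat + var_lon)  # sqrt(stddev(lat)^2 + stddev(lon)^2)
    return weight, sigma
```

Note that a single-record cluster naturally comes out with sigma 0, matching the single-record case above.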

The renderer

The actual rendering logic in the client is fully configurable with some black attrmapper magic. Basically, each parameter of the rendering process (like the weight, but it may be any of the attributes of a single record) goes through a polynomial equation that produces the actual values used in the renderer.

In most workspaces it's just that vWeight (the render parameter) is 0.2 * weight (the record value), but all those parameters can be tuned in the client through the editing sidebar (and that's exactly how the intensity slider works).

Now, on to the rendering process itself. The renderer uses the sigma to increase or decrease the radius of the points it draws, and the weight to increase or decrease the transparency of each point (higher weight, less transparent). It uses special logic for values greater than one, applying a logarithmic scale to them, because weight values are essentially unbounded.
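In rough Python terms, the two stages might look like this (illustrative names and formulas only; the actual shader math lives in ClusterAnimation-vertex.glsl and may differ in detail):

```python
import math

def v_weight(record_weight, coefficient=0.2):
    """Sketch of the attrmapper stage: the record weight goes through a
    polynomial (here just the linear term 0.2 * weight mentioned above,
    as in most workspaces) to produce the render parameter vWeight."""
    return coefficient * record_weight

def alpha_from_weight(v_weight_value):
    """Sketch of the shader-side treatment: values above 1.0 are
    compressed logarithmically because weights are unbounded. The
    exact formula in the shader is assumed, not quoted."""
    if v_weight_value > 1.0:
        return 1.0 + math.log(v_weight_value)
    return v_weight_value
```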

The idea of this process is that if you have 4 points at one zoom level that get clustered at zoom level − 1, then the overall intensity of the area stays the same: lower down, I had 4 points with intensity x, totalling intensity 4x over the area; higher up, I have one cluster with intensity 4x covering more or less the same area, so the intensity remains constant.

The problem

The idea looks good on paper, but the problem is that this logic means that each time you zoom in you get clusters with necessarily less weight than at the previous zoom level in the same area. Eventually, you reach the threshold where clusters become very faint or invisible.

Conceptually, I think the problem lies in the fact that while intensity remains constant per unit of world area, it varies greatly per unit of screen area due to the zooming itself. At zoom level a I'm seeing a single circle covering real-world area ra with intensity ia, which takes up screen area sa. When I zoom in, I see 4 clusters, each with intensity ia/4, located inside the same real-world area ra, but now that area takes up a much larger screen area sb. Thus, while the overall intensity still holds in world coordinates, the resulting intensity per unit of screen area is much lower because the screen area itself is bigger.
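A toy calculation of that argument (illustrative numbers only, assuming one zoom level doubles the linear scale and thus quadruples the screen area):

```python
def intensity_per_screen_area(total_weight, screen_area):
    """Toy model: perceived brightness ~ total weight spread over the
    screen area the clusters occupy."""
    return total_weight / screen_area

x, s = 1.0, 100.0
# Zoomed out: one cluster of weight 4x occupying screen area s.
coarse = intensity_per_screen_area(4 * x, s)
# One level in: four clusters of weight x each, inside the same world
# area, but that world area now occupies roughly 4 * s of screen space.
fine = intensity_per_screen_area(4 * x, 4 * s)
# Each zoom level quarters the screen-space intensity.
assert abs(fine - coarse / 4) < 1e-12
```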

One thing to note is that processing weights 0 < w < 1 differently will not necessarily fix anything, because there are cases in which a weight in that interval has nothing to do with zooming. For example, if I move the intensity slider to the left because I just want to focus on very strong clusters, I may be processing points with final intensity < 1, but I want those clusters hidden, so it's OK to display them very faintly.

Another thing to note here is that it might be difficult to make this work by changing the shader code only (for example, by mapping values using other functions, parametrizable by the intensity slider), because the appropriate values of the parameters depend on the zoom level and on the maximum and minimum intensity of the zoom level you are currently in. And we have to take into account that some clusters may have a lower intensity because of the data itself, while other clusters may have a lower intensity due to the zooming logic.

Solutions

We can do a bunch of things here, and we would have to test to see how each looks, but some suggestions for possible first steps could be these:

redhog commented 8 years ago

Excellent summary @andres-arana . I feel like this is exactly what was missing from the discussion. @enriquetuya and @geowurster please read it!

I made a small clarification of what sigma means.

enriquetuya commented 8 years ago

@pwoods25443 Which of the solutions stated by @andres-arana do you think is best to start working on? If it's the last one, please let us know which other statistics you think could work.

pwoods25443 commented 8 years ago

@enriquetuya I think we have to stick with summing the weights - this is a clustering of individual points, so the cluster has to sum. Our problem comes when we have clusters with just a few points in them. At that point I think what you really want is to switch modes and go from clustering to fixed-intensity points. One way to do this is to switch modes based on the total number of accumulated clusters in the tile: when that number gets below some threshold, make all the points have a weight of 1 and a sigma of 0. We could do this in the tile builder instead of in the client so it happens dynamically at the right zoom level.
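That proposal could look something like this as a tile-builder step (a sketch only; the function name, cluster representation, and threshold value are all made up for illustration):

```python
def finalize_tile(clusters, threshold=50):
    """Sketch of the proposed mode switch: when a tile holds only a few
    clusters, drop the clustering semantics and emit fixed-intensity
    points (weight 1, sigma 0) instead. `clusters` is assumed to be a
    list of dicts with "weight" and "sigma" keys."""
    if len(clusters) < threshold:
        # Sparse tile: render every point at a fixed, visible intensity.
        return [{**c, "weight": 1.0, "sigma": 0.0} for c in clusters]
    # Dense tile: keep the summed-weight clustering behaviour as-is.
    return clusters
```

Because the check runs per tile at build time, the switchover happens dynamically at whatever zoom level a given area becomes sparse, which is the behaviour described above.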