Closed mzagorskirs closed 2 years ago
For this task, I developed an aggregation query that sums the hail sizes per GeoHash precision level 4 cell. As expected, the resulting heatmap looks the same as the original (non-aggregated) heatmap. However, only 1,556 features were read by GeoWaveFeatureReader instead of all 13,742 features in the hail dataset (about 11%). So the aggregation requires fewer calls, but there is a bit more processing involved to build the aggregation query and convert the results to SimpleFeatures. The overall runtime is not necessarily shorter for the aggregation heatmap, but the network load would be reduced:
I also developed a count metric for this task and created a heatmap for it. As expected, that heatmap looks similar to the one above. This is the benefit of downscaling: a similar heatmap from fewer points, with reduced load on the network.
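To make the sum and count metrics concrete, here is a minimal standalone sketch of per-cell aggregation. It is not the GeoWave aggregation query or any GeoWave API: the function names `geohash_cell` and `aggregate_by_geohash`, the `(lat, lon, size)` tuple layout, and the sample values are all assumptions made for illustration. It bins points into GeoHash cells using the standard GeoHash bit interleaving and keeps one record per cell, located at the cell centroid.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard GeoHash alphabet


def geohash_cell(lat, lon, precision=4):
    """Return (geohash, (centroid_lat, centroid_lon)) for the cell containing the point."""
    lat_rng = [-90.0, 90.0]
    lon_rng = [-180.0, 180.0]
    chars = []
    use_lon = True  # GeoHash interleaves bits, starting with longitude
    for _char in range(precision):
        idx = 0
        for _bit in range(5):  # 5 bits per base-32 character
            rng, value = (lon_rng, lon) if use_lon else (lat_rng, lat)
            mid = (rng[0] + rng[1]) / 2
            if value >= mid:
                idx = idx * 2 + 1
                rng[0] = mid  # keep the upper half
            else:
                idx = idx * 2
                rng[1] = mid  # keep the lower half
            use_lon = not use_lon
        chars.append(BASE32[idx])
    centroid = ((lat_rng[0] + lat_rng[1]) / 2, (lon_rng[0] + lon_rng[1]) / 2)
    return "".join(chars), centroid


def aggregate_by_geohash(points, precision=4):
    """Collapse (lat, lon, size) points into one sum/count record per GeoHash cell."""
    cells = {}
    for lat, lon, size in points:
        key, centroid = geohash_cell(lat, lon, precision)
        cell = cells.setdefault(key, {"centroid": centroid, "sum": 0.0, "count": 0})
        cell["sum"] += size
        cell["count"] += 1
    return cells


if __name__ == "__main__":
    # Made-up sample points (lat, lon, hail size); not taken from the hail dataset.
    sample = [(57.64911, 10.40744, 1.5), (57.60, 10.30, 2.0), (35.0, -97.5, 1.0)]
    for key, cell in aggregate_by_geohash(sample).items():
        print(key, cell)
```

In this sketch the first two sample points fall into the same precision level 4 cell, so three input points become two output records. That is the same effect, at a small scale, as the reduction in features described above.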
Zooming in on the aggregate heatmaps reveals the regular grid pattern of the GeoHash centroids:
Zooming in a bit more makes the regular grid of GeoHash centroids, where the data has been aggregated, even clearer (this is GeoHash precision level 4):
For comparison, the original heatmap (non-aggregated data), zoomed in:
I verified the accuracy of both the field sum and field count aggregation results. For example, the count of points within this GeoHash precision level 4 cell is 2, as shown by the attribute on the cell's centroid point:
Task demo'd and approved.
approved
Epic: https://github.com/mzagorskirs/geowave/issues/3
Note: stick to either aggregations or statistics, and do not tackle both in the initial work: