Closed keithfraley closed 3 weeks ago
@keithfraley Check out the branch https://github.com/koopjs/koop-provider-elasticsearch/tree/koop-4.x-update to see upcoming changes that take advantage of the built-in ES geohash aggregation.
Exciting
(Replying by email to Danny Hatcher's comment of Tue, Nov 3, 2020, 1:39 PM.)
I noticed that the geohash aggregation seems to have some issues when aggregating large areas. It would appear that the precision is decided by the width of the bounding box? If so, the precision factor gets skewed as we get closer to the equator (see screenshots).
I also noticed that we occasionally get overlapping aggregations. I think this can be handled, potentially by reducing the number of records brought back.
In addition, a couple of enhancements as we move forward. The aggregation component of ES also provides a centroid of type geo_point. The really cool part about this is that it is actually the centroid of the records within the geohash, not of the geohash itself. It gives a more accurate location than the polygons, and it's really useful when building historical movements.
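To make the centroid idea concrete, here is a minimal sketch, not the provider's actual code. It assumes a `geo_centroid` sub-aggregation nested under a `geohash_grid` aggregation; the aggregation names `grid` and `centroid` and both helper functions are hypothetical.

```javascript
// Hypothetical helper: geohash_grid with a geo_centroid sub-aggregation.
// The names "grid" and "centroid" are illustrative, not taken from the provider.
function buildCentroidAggs(geoField, precision) {
  return {
    grid: {
      geohash_grid: { field: geoField, precision: precision },
      aggs: {
        centroid: { geo_centroid: { field: geoField } }
      }
    }
  };
}

// Hypothetical helper: turn one bucket of that aggregation into a GeoJSON point.
// ES reports the centroid as { location: { lat, lon }, count }.
function bucketToPointFeature(bucket) {
  const { lat, lon } = bucket.centroid.location;
  return {
    type: "Feature",
    geometry: { type: "Point", coordinates: [lon, lat] }, // GeoJSON order is [lon, lat]
    properties: { geohash: bucket.key, count: bucket.doc_count }
  };
}
```

Each feature then sits at the weighted center of the records in the cell rather than at the center of the cell.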
The other enhancement is the ability to do more than just counts for aggregations. For example, if we want to sum a field for each geohash, that is a pretty easy enhancement to make by adding that param to the config.
Something like:
    // Add a metric sub-aggregation when the configured type is not a plain count
    if (indexConfig.aggregation.type !== 'count') {
        esQuery.body.aggs["2"].aggs["1"] = {
            [indexConfig.aggregation.type]: {
                "field": indexConfig.aggregation.field
            }
        };
    }
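A slightly more defensive version of the snippet above could look like the sketch below. It keeps the `indexConfig.aggregation.type` and `indexConfig.aggregation.field` names and the `"2"` / `"1"` aggregation keys from the snippet; the function name, the list of allowed metrics, and the error handling are assumptions.

```javascript
// Metric aggregations this sketch allows (all are standard ES metric aggregations).
const SUPPORTED_METRICS = new Set(["sum", "avg", "min", "max"]);

// Hypothetical helper: attach a metric sub-aggregation to the geohash bucket aggregation.
function addMetricAggregation(esQuery, indexConfig) {
  const aggregation = indexConfig.aggregation || { type: "count" };

  // Plain counts need nothing extra: every bucket already carries doc_count.
  if (aggregation.type === "count") return esQuery;

  if (!SUPPORTED_METRICS.has(aggregation.type) || !aggregation.field) {
    throw new Error("Unsupported aggregation config: " + JSON.stringify(aggregation));
  }

  const bucketAgg = esQuery.body.aggs["2"];
  bucketAgg.aggs = bucketAgg.aggs || {}; // the sub-aggregation map may not exist yet
  bucketAgg.aggs["1"] = {
    [aggregation.type]: { field: aggregation.field }
  };
  return esQuery;
}
```

With a config such as `{ aggregation: { type: "sum", field: "speed" } }` (the field name is made up), each geohash bucket would carry the summed value under key `"1"` alongside its `doc_count`.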
That's right, the aggregation scale is determined based on the height of the bounding box, and since we treat every query separately, issues like the above can show up sometimes. If you want to post those enhancements as new issues, we can mark them as enhancements and track them.
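For readers following along, here is a rough sketch of how a precision could be derived from the bounding box height. This is not the provider's actual logic: the function names, the `bbox` shape (`ymin` / `ymax` in degrees) and the `targetRows` parameter are all assumptions. The only fixed fact used is that a geohash carries 5 bits per character, of which floor(5p/2) go to latitude.

```javascript
// Height in degrees of one geohash cell at the given precision (1-12).
// Of the 5 * precision bits in a geohash, floor(5 * precision / 2) encode latitude.
function geohashCellHeightDeg(precision) {
  const latBits = Math.floor((5 * precision) / 2);
  return 180 / Math.pow(2, latBits);
}

// Hypothetical helper: pick the coarsest precision that still yields at least
// `targetRows` rows of cells across the height of the bounding box.
function precisionForBbox(bbox, targetRows = 8) {
  const heightDeg = Math.abs(bbox.ymax - bbox.ymin);
  for (let precision = 1; precision <= 12; precision++) {
    if (heightDeg / geohashCellHeightDeg(precision) >= targetRows) {
      return precision;
    }
  }
  return 12; // very small boxes fall back to the finest precision
}
```

Because each request computes its own precision from its own box, two adjacent requests with different heights can end up on different grids, which would be consistent with the inconsistencies described above.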
Ok, I will post some of my thoughts on this here, starting with the config file. I think the config file would need changes in order to trigger results as an aggregation. Basically, the aggregation feature would return a hash polygon, leveraging the GeoJSON that is returned from ES.
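As an illustration of what a "hash polygon" could look like, the sketch below decodes a geohash key into its bounding cell and wraps it as a GeoJSON polygon feature. It uses the standard geohash base32 decoding; the function names and the properties attached to the feature are assumptions, and the provider may build its polygons differently.

```javascript
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Decode a geohash into the bounds of its cell.
// Bits alternate between longitude and latitude, starting with longitude.
function geohashToBounds(hash) {
  const lat = [-90, 90];
  const lon = [-180, 180];
  let isLon = true;
  for (const ch of hash.toLowerCase()) {
    const idx = BASE32.indexOf(ch);
    if (idx === -1) throw new Error("Invalid geohash character: " + ch);
    for (let bit = 4; bit >= 0; bit--) {
      const range = isLon ? lon : lat;
      const mid = (range[0] + range[1]) / 2;
      if ((idx >> bit) & 1) range[0] = mid; // bit set: keep the upper half
      else range[1] = mid;                  // bit clear: keep the lower half
      isLon = !isLon;
    }
  }
  return { xmin: lon[0], ymin: lat[0], xmax: lon[1], ymax: lat[1] };
}

// Hypothetical helper: wrap the cell as a GeoJSON polygon feature.
function geohashToPolygonFeature(hash, properties = {}) {
  const b = geohashToBounds(hash);
  return {
    type: "Feature",
    geometry: {
      type: "Polygon",
      // Closed ring in counterclockwise order
      coordinates: [[
        [b.xmin, b.ymin], [b.xmax, b.ymin], [b.xmax, b.ymax],
        [b.xmin, b.ymax], [b.xmin, b.ymin]
      ]]
    },
    properties: { geohash: hash, ...properties }
  };
}
```

A bucket from the aggregation could then be mapped with something like `geohashToPolygonFeature(bucket.key, { count: bucket.doc_count })`.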
Getting the spatial results back should be very straightforward. The hard part will be ensuring that all the attribute fields still come back, because they are needed for filtering and time series interactions.