harsha2010 / magellan

Geo Spatial Data Analytics on Spark
Apache License 2.0
534 stars, 149 forks

Support for distance join between points #219

Open pinireisman opened 6 years ago

pinireisman commented 6 years ago

For parallelizing clustering algorithms at scale, it would be very useful to have an optimized join between two sets of points that finds all pairs of points within distance X of one another (similar to the Distance Join in the GeoSpark-SQL project).

One can achieve that by building polygons around one set of points and then using the "point within polygon" query, but this seems wasteful.
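To make the request concrete, here is a minimal plain-Python sketch of the semantics I have in mind. It is not Magellan or GeoSpark code: the function name `distance_join` and the grid-bucketing approach are my own illustration, and it assumes planar (Euclidean) coordinates, so latitude/longitude data would need a different distance function. Each point is assigned to a grid cell whose side equals the distance threshold, so a point only has to be compared against points in its own cell and the eight neighboring cells.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def distance_join(left, right, max_dist):
    """Return sorted (i, j) index pairs where left[i] and right[j]
    are at most max_dist apart (Euclidean distance, planar coordinates).

    Hypothetical helper for illustration only; not part of any library.
    """
    if max_dist <= 0:
        raise ValueError("max_dist must be positive")

    def cell(point):
        # Grid cell with side length max_dist that contains the point.
        return (floor(point[0] / max_dist), floor(point[1] / max_dist))

    # Bucket the right-hand points by grid cell.
    buckets = defaultdict(list)
    for j, q in enumerate(right):
        buckets[cell(q)].append(j)

    pairs = []
    for i, p in enumerate(left):
        cx, cy = cell(p)
        # Any match must lie in the same cell or one of its 8 neighbors.
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in buckets.get((cx + dx, cy + dy), ()):
                q = right[j]
                if hypot(p[0] - q[0], p[1] - q[1]) <= max_dist:
                    pairs.append((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    left = [(0.0, 0.0), (5.0, 5.0)]
    right = [(0.5, 0.5), (5.0, 6.0), (10.0, 10.0)]
    print(distance_join(left, right, 1.0))
```

In a distributed setting the grid cell could presumably serve as the join key, with each point replicated to its neighboring cells, which avoids building a polygon per point.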

Any plans for such a join?

Thanks!