Open seamusdu opened 7 years ago
Ideas and cross-references:
https://github.com/Esri/spatial-framework-for-hadoop/issues/28
http://stackoverflow.com/questions/38963487/how-to-optimize-scan-of-1-huge-file-table-in-hive-to-confirm-check-if-lat-long/
http://gis.stackexchange.com/questions/178732/geospatial-queries-and-indexes-in-memory/
http://thunderheadxpler.blogspot.com/2013/10/bigdata-spatial-joins.html
http://getindata.com/blog/post/geospatial-analytics-on-hadoop/
https://cwiki.apache.org/confluence/display/Hive/Spatial+queries
https://github.com/Esri/spatial-framework-for-hadoop/issues/82
If your polygon dataset fits in memory, build an in-memory quadtree index on the polygons using the Geometry API, adapting the MapReduce sample in GIS-Tools-for-Hadoop for Spark.
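To illustrate the idea only, here is a minimal, library-free sketch in Python. It is not the Geometry API's quadtree and not the GIS-Tools-for-Hadoop sample; every name in it (`QuadTree`, `count_points_in_polygons`, `point_in_ring`) is hypothetical. It indexes polygon bounding boxes in a quadtree, uses the tree to find candidate polygons for each point, and runs the exact containment test only on those candidates. Polygons are assumed to be simple rings without holes, and points lying exactly on an edge are not treated specially.

```python
def bbox_of(ring):
    """Bounding box (xmin, ymin, xmax, ymax) of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def intersects(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_ring(pt, ring):
    """Ray-casting point-in-polygon test for a simple ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


class QuadTree:
    """Quadtree over bounding boxes; a box is stored in every leaf it touches."""

    def __init__(self, extent, capacity=4, depth=0, max_depth=8):
        self.extent = extent
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []      # (bbox, polygon_id) pairs, kept in leaves only
        self.children = []   # empty for a leaf, four nodes otherwise

    def insert(self, bbox, pid):
        if self.children:
            for child in self.children:
                if intersects(child.extent, bbox):
                    child.insert(bbox, pid)
            return
        self.items.append((bbox, pid))
        if len(self.items) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        xmin, ymin, xmax, ymax = self.extent
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        quads = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                 (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quads]
        items, self.items = self.items, []
        for bbox, pid in items:  # push existing items down into the new leaves
            for child in self.children:
                if intersects(child.extent, bbox):
                    child.insert(bbox, pid)

    def query(self, pt):
        """Ids of polygons whose bounding box contains pt (candidates only)."""
        x, y = pt
        xmin, ymin, xmax, ymax = self.extent
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            return set()
        if not self.children:
            return {pid for b, pid in self.items
                    if b[0] <= x <= b[2] and b[1] <= y <= b[3]}
        found = set()
        for child in self.children:
            found |= child.query(pt)
        return found


def count_points_in_polygons(polygons, points):
    """polygons: dict of id -> ring; points: iterable of (x, y). Returns id -> count."""
    boxes = {pid: bbox_of(ring) for pid, ring in polygons.items()}
    extent = (min(b[0] for b in boxes.values()), min(b[1] for b in boxes.values()),
              max(b[2] for b in boxes.values()), max(b[3] for b in boxes.values()))
    tree = QuadTree(extent)
    for pid, box in boxes.items():
        tree.insert(box, pid)
    counts = {pid: 0 for pid in polygons}
    for pt in points:
        for pid in tree.query(pt):               # cheap candidate lookup
            if point_in_ring(pt, polygons[pid]):  # exact test on candidates only
                counts[pid] += 1
    return counts
```

For example, `count_points_in_polygons({"a": [(0, 0), (4, 0), (4, 4), (0, 4)]}, [(1, 1), (9, 9)])` counts one point in polygon `a`. In a Spark job, the analogous step would presumably be to build the index once on the driver and ship it to the executors (for instance as a broadcast variable), so that each point is tested only against nearby polygons rather than against all of them.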
Hi @randallwhitman
Thanks for your reply. The sample using a quadtree index does help, and I will try to use the Geometry API with Spark.
@seamusdu How did you find running the Spatial Framework on Spark in the end? It is an option I'm looking at at the moment.
Cross-reference re Spark: #97 (works with JsonSerde as of v1.2)
@seamusdu I am doing the same thing, wrapping a spatial join query with an index in the geospark R package.
Has anyone tried benchmarking the number of points against the time it took to process them? Or even a comparison between Hive and MapReduce (with spatial indexing)?
@guillemfrancisco There is a little bit of info in a comment under https://stackoverflow.com/questions/38963487/how-to-optimize-scan-of-1-huge-file-table-in-hive-to-confirm-check-if-lat-long
I am using HiveContext within Spark to run this spatial framework, and it does work. However, once I use a large dataset, performance declines dramatically. I am trying to count points within polygons. Have you done any performance tests that might explain the performance of this framework? Also, have you considered creating a spatial index, which might improve the performance of spatial operations?
Thanks.