argo-marketplace / LA-LocalGeo-CrossWalk

A single file that links up all the local geographies in LA County
Apache License 2.0

Intersect geographies using Spatial Join or Spatial Overlay? #8

Closed gauravcusp closed 6 years ago

gauravcusp commented 6 years ago

Hey @vr00n. When you did the geo cross-walk for NYC, what method did you use: spatial join or spatial overlay? I ask because the geopandas overlay function takes forever to run on big files such as census blocks or tracts. @lingyielia @xd515
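For readers weighing the two options in the question above: a spatial join pairs up features whose geometries satisfy a predicate and leaves the shapes untouched, whereas an overlay cuts new geometries out of the inputs. The sketch below illustrates that difference in plain Python, using axis-aligned rectangles as stand-ins for polygons. The `Rect` type, the helper functions and the sample data are all hypothetical and are not geopandas APIs; in geopandas the corresponding calls would be `geopandas.sjoin` and `geopandas.overlay`.

```python
from typing import List, NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a polygon (illustrative only)."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle, or None if the interiors do not overlap."""
    xmin, ymin = max(a.xmin, b.xmin), max(a.ymin, b.ymin)
    xmax, ymax = min(a.xmax, b.xmax), min(a.ymax, b.ymax)
    if xmin >= xmax or ymin >= ymax:
        return None
    return Rect(f"{a.name}&{b.name}", xmin, ymin, xmax, ymax)


def spatial_join(left: List[Rect], right: List[Rect]) -> List[Tuple[str, str]]:
    """Join-style result: name pairs of overlapping features; no new geometry is built."""
    return [(a.name, b.name)
            for a in left for b in right
            if intersection(a, b) is not None]


def spatial_overlay(left: List[Rect], right: List[Rect]) -> List[Rect]:
    """Overlay-style result: one new, clipped geometry per overlapping pair."""
    pieces = (intersection(a, b) for a in left for b in right)
    return [p for p in pieces if p is not None]


# Hypothetical sample data: two tracts side by side, one district straddling both.
tracts = [Rect("tract1", 0, 0, 2, 2), Rect("tract2", 2, 0, 4, 2)]
districts = [Rect("districtA", 1, 0, 3, 2)]

pairs = spatial_join(tracts, districts)      # which features go together
pieces = spatial_overlay(tracts, districts)  # the clipped shapes themselves
```

The join only answers "which features match", so it avoids constructing geometry. The overlay builds a new shape for every overlapping pair, and with real polygons that construction step is far more expensive than with rectangles.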


vr00n commented 6 years ago

I actually used the Intersect function in QGIS when building the NYC dataset. A goal for this project is to find the equivalent method using geopandas.

QGIS Intersect function via https://docs.qgis.org/2.18/en/docs/user_manual/processing/vector_menu.html#geoprocessing-tools


You may need to research, or ask on Stack Overflow, what the corresponding geopandas function is.

Spatial Overlay sounds right but please verify.

xd515 commented 6 years ago

Hi @gauravcusp @lingyielia @vr00n, this week I used spatial overlay to perform the intersection, but the final result was incomplete. I wonder whether we can apply spatial overlay in this process. Please check the following notebook: week2

lingyielia commented 6 years ago

I think it's because 'Council Boundaries of LA County' covers only a subset of LA County, and 'intersection' only keeps the areas common to both datasets. In this situation I think 'union' is more appropriate, since it keeps the whole of LA County.
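The coverage argument above can be checked with simple area arithmetic. This is a minimal sketch in plain Python that assumes each layer is a single axis-aligned box; the coordinates are made up for illustration and are not taken from the LA County data. In geopandas the two modes correspond to `how="intersection"` and `how="union"` in `geopandas.overlay`.

```python
def area(box):
    """Area of an (xmin, ymin, xmax, ymax) box; empty boxes count as zero."""
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def overlap(a, b):
    """Box shared by a and b (it has zero area if they do not overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


# Hypothetical extents: one layer covers the whole county, the other only part of it.
fire = (0.0, 0.0, 10.0, 10.0)
council = (0.0, 0.0, 4.0, 10.0)

common = area(overlap(fire, council))

# 'intersection' keeps only what both layers share.
intersection_area = common
# 'union' keeps everything covered by either layer.
union_area = area(fire) + area(council) - common

# Fraction of the county-wide layer that survives each mode.
kept_by_intersection = intersection_area / area(fire)
kept_by_union = union_area / area(fire)
```

With these made-up extents the intersection retains only the part of the county that the smaller layer covers, while the union retains all of it, which matches the incomplete result reported earlier in the thread.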

xd515 commented 6 years ago

Yes, 'Council Boundaries' is a subset of LA County, and I tried a 'union' between 'Council Boundaries' and 'Fire Boundaries', but it killed the kernel every time. I then used 'concat' to get the complete council boundaries within LA County, which worked. However, the problem appeared at the last step, so I wonder whether we should use spatial overlay in this case.

gauravcusp commented 6 years ago

The geopandas tools are slow when it comes to large numbers of polygons. I would recommend using the function I used in the first notebook, which also has an option to perform a union.

vr00n commented 6 years ago

@gauravcusp agreed and thanks for flagging this.

Please carefully review these examples

  1. More efficient Spatial Joins - https://gis.stackexchange.com/questions/102933/more-efficient-spatial-join-in-python-without-qgis-arcgis-postgis-etc/165413#165413

  2. Efficient intersections - https://gis.stackexchange.com/questions/227423/how-to-efficiently-determine-which-of-thousands-of-polygons-intersect-with-a-lin

Both examples use shapely with geopandas to improve performance. However, you could also use rtree to build a spatial index before performing the regular geopandas intersection. This appears to offer a substantial performance improvement without having to learn a new package.

  1. A solid example of how to implement spatial indexing using r-tree - https://snorfalorpagus.net/blog/2014/05/12/using-rtree-spatial-indexing-with-ogr/

Shapely's manual is here - https://toblerity.org/shapely/manual.html
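The idea behind indexing first is to use cheap bounding-box lookups to discard most candidate pairs before running any exact geometry test. The sketch below shows that two-stage pattern in plain Python. It uses a uniform grid index as a simpler stand-in for an R-tree, and the `GridIndex` class, the cell size and the sample blocks are all hypothetical; the rtree package has its own API, which is not reproduced here.

```python
from collections import defaultdict
from math import floor


def boxes_intersect(a, b):
    """Exact test for (xmin, ymin, xmax, ymax) boxes; touching edges count."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform grid index: a simplified stand-in for an R-tree (illustrative only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (ix, iy) -> ids of boxes touching that cell
        self.boxes = {}                # id -> box

    def _cells_for(self, box):
        """Yield every grid cell that the box's extent touches."""
        x0, x1 = floor(box[0] / self.cell_size), floor(box[2] / self.cell_size)
        y0, y1 = floor(box[1] / self.cell_size), floor(box[3] / self.cell_size)
        for ix in range(x0, x1 + 1):
            for iy in range(y0, y1 + 1):
                yield (ix, iy)

    def insert(self, item_id, box):
        self.boxes[item_id] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(item_id)

    def candidates(self, box):
        """Stage 1: ids sharing at least one grid cell with the query box."""
        found = set()
        for cell in self._cells_for(box):
            found |= self.cells.get(cell, set())
        return found

    def query(self, box):
        """Stage 2: run the exact test on the stage-1 candidates only."""
        return sorted(i for i in self.candidates(box)
                      if boxes_intersect(self.boxes[i], box))


# Hypothetical data: a 10 x 10 sheet of unit "census blocks".
blocks = {i * 10 + j: (i, j, i + 1, j + 1) for i in range(10) for j in range(10)}

index = GridIndex(cell_size=2.0)
for block_id, box in blocks.items():
    index.insert(block_id, box)

query_box = (2.5, 2.5, 3.5, 3.5)
hits = index.query(query_box)
brute_force = sorted(i for i, box in blocks.items() if boxes_intersect(box, query_box))
n_candidates = len(index.candidates(query_box))
```

Both approaches return the same blocks, but the indexed query runs the exact test on a handful of candidates instead of all 100. With real polygons the exact test is the expensive part, so the saving grows with the size of the layers.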

xd515 commented 6 years ago

Thanks a lot, @vr00n. I tried geopandas.sjoin after implementing r-tree, and it worked well. In the process, I applied both contains and intersects to get the final result. I think we should all use sjoin to get the final shapefile. @gauravcusp @lingyielia
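The two predicates mentioned above give different matches, which matters when choosing between them. This is a minimal sketch in plain Python, again using rectangles as stand-ins for polygons; the `join` helper and the sample layers are hypothetical. In `geopandas.sjoin` the predicate is chosen through the `predicate` argument (named `op` in older releases).

```python
def contains(outer, inner):
    """True if the inner box lies entirely within the outer box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def intersects(a, b):
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def join(left, right, predicate):
    """Pair every left feature with each right feature satisfying the predicate."""
    return [(l_name, r_name)
            for l_name, l_box in left.items()
            for r_name, r_box in right.items()
            if predicate(l_box, r_box)]


# Hypothetical layers, boxes given as (xmin, ymin, xmax, ymax).
communities = {"north": (0, 0, 10, 10), "south": (0, -10, 10, 0)}
precincts = {
    "p1": (1, 1, 3, 3),   # entirely inside "north"
    "p2": (4, -1, 6, 1),  # straddles the north/south boundary
}

contained = join(communities, precincts, contains)
touching = join(communities, precincts, intersects)
```

Here `contains` assigns each precinct to at most one community but drops the one that straddles the boundary, while `intersects` keeps it and lists it once per community it touches. Combining the two, as described above, is one way to handle both cases.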

lingyielia commented 6 years ago

@xd515, can you please report how geopandas.sjoin with r-tree performs compared with the spatial_overlays function used here? Which one shall we use in our project?

xd515 commented 6 years ago

Hi @gauravcusp @lingyielia @vr00n, I uploaded the intersection of four datasets, built with geopandas.sjoin: LAcounty_COMMUNITIES, Registrar Recorder Precincts, Census Block (2010) and School District Boundaries (2011). You can check it here. There's a problem when intersecting with Law Enforcement Reporting Districts.

patwater commented 6 years ago

Hi team, any objections to closing this issue? Doing some housekeeping.