Closed gauravcusp closed 6 years ago
I actually used the Intersect function in QGIS when building the NYC dataset. A goal for this project is to find the equivalent method using geopandas
QGIS Intersect function via https://docs.qgis.org/2.18/en/docs/user_manual/processing/vector_menu.html#geoprocessing-tools
You may need to research, or ask on Stack Overflow, what the corresponding geopandas function is.
Spatial Overlay sounds right but please verify.
Hi @gauravcusp @lingyielia @vr00n, I performed a spatial overlay using intersection this week, but the final result of the intersection was incomplete. I wonder whether we should apply spatial overlay in this process. Please check the following notebook: week2
I think it's because the 'Council Boundaries of LA County' is a subset of LA County, and 'intersection' only keeps the areas common to both datasets. In this situation, I think 'union' is more appropriate, since it keeps the whole of LA County.
Yes, 'Council Boundaries' is a subset of LA County. I tried 'union' between 'Council Boundaries' and 'Fire Boundaries', but it killed the kernel every time. I then used 'concat' to get the complete council boundaries in LA County, which worked. However, the problem appeared at the last step, so I wonder whether we should use spatial overlay in this case.
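To illustrate the difference described above, here is a minimal sketch using toy squares rather than the real LA County layers. The column names and geometries are made up for the example; only `geopandas.overlay` and its `how` argument come from geopandas itself.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-ins: a 2x2 "county" and a 1x1 "council" area inside it.
county = gpd.GeoDataFrame({"county": ["LA"]}, geometry=[box(0, 0, 2, 2)])
council = gpd.GeoDataFrame({"council": ["C1"]}, geometry=[box(0, 0, 1, 1)])

# 'intersection' keeps only the area shared by both layers.
inter = gpd.overlay(county, council, how="intersection")

# 'union' keeps the shared area plus whatever is covered by only one layer.
union = gpd.overlay(county, council, how="union")

inter_area = inter.geometry.area.sum()
union_area = union.geometry.area.sum()
print(inter_area, union_area)
```

With a subset layer, the intersection covers only the subset, while the union covers the full extent of the larger layer.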
The geopandas tools are computationally weak when it comes to a large number of polygons. I would recommend using the function I used in the first notebook, which also has an option to perform a union.
@gauravcusp agreed and thanks for flagging this.
Please carefully review these examples
More efficient Spatial Joins - https://gis.stackexchange.com/questions/102933/more-efficient-spatial-join-in-python-without-qgis-arcgis-postgis-etc/165413#165413
Efficient intersections https://gis.stackexchange.com/questions/227423/how-to-efficiently-determine-which-of-thousands-of-polygons-intersect-with-a-lin
Both examples use shapely with geopandas to improve performance. However, you could also implement rtree to create a spatial index before performing the regular geopandas intersection. This appears to offer many performance upgrades without having to learn a new package.
Shapely's manual is here - https://toblerity.org/shapely/manual.html
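As a rough illustration of why a spatial index helps, the sketch below builds a simple uniform grid index over bounding boxes in plain Python. This is a simplified stand-in, not rtree's actual data structure or API: an R-tree organises boxes hierarchically, but the goal is the same, namely to find a small set of candidates cheaply before running the expensive exact geometry test. All function names here are hypothetical.

```python
from collections import defaultdict
from math import floor


def overlaps(a, b):
    """True if two (minx, miny, maxx, maxy) boxes touch or overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def cells(bbox, size):
    """Yield every grid cell (cx, cy) that the box covers."""
    minx, miny, maxx, maxy = bbox
    for cx in range(floor(minx / size), floor(maxx / size) + 1):
        for cy in range(floor(miny / size), floor(maxy / size) + 1):
            yield (cx, cy)


def build_grid_index(bboxes, size):
    """Map each grid cell to the positions of the boxes that cover it."""
    index = defaultdict(set)
    for i, bbox in enumerate(bboxes):
        for cell in cells(bbox, size):
            index[cell].add(i)
    return index


def query(index, bboxes, bbox, size):
    """Return positions of boxes whose bounding box overlaps the query box."""
    candidates = set()
    for cell in cells(bbox, size):
        candidates |= index.get(cell, set())
    # Only the candidates sharing a cell are checked, not every box.
    return sorted(i for i in candidates if overlaps(bboxes[i], bbox))


boxes = [(0, 0, 1, 1), (5, 5, 6, 6), (0.5, 0.5, 2, 2)]
index = build_grid_index(boxes, 1.0)
print(query(index, boxes, (0.8, 0.8, 0.9, 0.9), 1.0))
```

Only the boxes returned by `query` would then need the exact (and slower) polygon intersection test.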
Thanks a lot, @vr00n. I tried geopandas.sjoin after implementing r-tree, and it worked well. In the process, I applied both contains and intersects to get the final result. I think we should all use sjoin to get the final shapefile. @gauravcusp @lingyielia
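For reference, a minimal sketch of a spatial join with the two predicates mentioned above, again on made-up toy geometries rather than the project's datasets. It assumes a geopandas version where `sjoin` accepts the `predicate` keyword (older releases used `op` instead).

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical layers: two adjacent districts and two small blocks.
districts = gpd.GeoDataFrame(
    {"district": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)
blocks = gpd.GeoDataFrame(
    {"block": ["b1", "b2"]},
    geometry=[box(0.2, 0.2, 0.4, 0.4), box(0.9, 0.2, 1.1, 0.4)],
)

# 'intersects': every block/district pair that shares any area or boundary.
# b1 lies inside A; b2 straddles the A/B border, so it matches both.
hits = gpd.sjoin(blocks, districts, how="inner", predicate="intersects")

# 'contains': only districts that fully contain a block.
contained = gpd.sjoin(districts, blocks, how="inner", predicate="contains")

pairs = sorted(zip(hits["block"], hits["district"]))
print(pairs)
print(list(zip(contained["district"], contained["block"])))
```

Unlike an overlay, `sjoin` does not cut geometries; it only attaches attributes from matching rows, which is part of why it tends to be cheaper on large layers.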
@xd515, can you please report the performance of geopandas.sjoin after implementing r-tree compared with the spatial_overlays approach used here? Which one shall we use in our project?
Hi @gauravcusp @lingyielia @vr00n, I uploaded the intersection, done with geopandas.sjoin, of four datasets: LAcounty_COMMUNITIES, Registrar Recorder Precincts, Census Block (2010) and School District Boundaries (2011). You can check it from here.
There's a problem when intersecting with Law Enforcement Reporting Districts.
Hi team, any objections to closing this issue? Doing some housekeeping.
Hey @vr00n. When you did the geo cross-walk for NYC, what method did you use: spatial join or spatial overlay? I ask because the geopandas overlay function takes forever to run on big files such as census blocks or tracts. @lingyielia @xd515