Many shapefiles cover a much larger area than you need (e.g. a whole country or state when you are only interested in a specific district).
This is particularly relevant for shapefiles retrieved from the Census FTP site, but it applies any time you're working with shapefiles. Almost every time I want to make a map, I end up opening the shapefile in QGIS and deleting all the extra shapes that aren't relevant.
Workflow Example:
1. Get data from the Census API (e.g. block groups in Harris County, Texas).
2. Get the shapefile from the Census FTP site (the only shapefile available covers ALL block groups in Texas, which is far more than we want).
3. Use this new feature to trim the shapefile down to the same set of shapes as the data you have, based on the GEOIDs returned by the API call.
4. Make a map with the data and the trimmed shapefile.
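To make the "GEOIDs from the API call" step concrete, here is a minimal sketch. It assumes the API response has already been parsed from JSON into a list of rows with the header row first, and that the geography columns are named `state`, `county`, `tract`, and `block group`. The function name `geoids_from_rows` and the sample row are made up for illustration.

```python
def geoids_from_rows(rows):
    """Build block-group GEOIDs (state + county + tract + block group)."""
    header, *records = rows
    # Positions of the geography columns, in the order the GEOID concatenates them.
    positions = [header.index(name)
                 for name in ("state", "county", "tract", "block group")]
    return ["".join(record[i] for i in positions) for record in records]


# Illustrative rows only (not real data); 48201 is Harris County, Texas.
sample_rows = [
    ["NAME", "state", "county", "tract", "block group"],
    ["Example block group", "48", "201", "100000", "1"],
]
include_these_GEOIDs = geoids_from_rows(sample_rows)
```

The resulting strings should line up with the `GEOID` property in the block-group shapefile, which is what the trimming step would match on.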
Proposal
Make a utility that removes all the unwanted shapes from a shapefile.
You would specify which shapefile property to look at and a list of values to keep; every shape whose value for that property is not in the list would be removed.
>>> # Example of how it might be used
>>> include_these_GEOIDs = ["1234", "5678", ...]
>>> bbd.trim_shapefile(
...     original_path="...",
...     join_on="GEOID",
...     include=include_these_GEOIDs,
...     new_path=None,  # perhaps would be original_path + "_trimmed" by default
... )
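One possible shape for the utility, as a rough sketch rather than a settled design. It assumes geopandas is available for reading and writing (imported lazily inside the function), and it interprets the `_trimmed` default as a suffix on the file stem. `default_trimmed_path` is a hypothetical helper, not an existing bbd function.

```python
from pathlib import Path


def default_trimmed_path(original_path):
    """Hypothetical default: 'foo.shp' -> 'foo_trimmed.shp' (an assumption)."""
    p = Path(original_path)
    return str(p.with_name(p.stem + "_trimmed" + p.suffix))


def trim_shapefile(original_path, join_on, include, new_path=None):
    """Keep only the shapes whose `join_on` property is listed in `include`."""
    # Third-party dependency, imported here so the helper above works without it.
    import geopandas as gpd

    shapes = gpd.read_file(original_path)
    # Compare as strings, since GEOIDs are identifiers rather than numbers.
    wanted = [str(value) for value in include]
    keep = shapes[join_on].astype(str).isin(wanted)

    out_path = new_path if new_path is not None else default_trimmed_path(original_path)
    shapes[keep].to_file(out_path)
    return out_path
```

Returning the output path would let the caller pass the trimmed file straight to whatever draws the map.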