Closed dcolombara closed 1 month ago
The difference probably comes from using a different ZIP shapefile ("/APDE-CDIP/Shapefiles/ZIP/zipcode.shp" is used instead of Shapefiles_protected/ZIP/adci_wa_zip_raw.shp). We could/should screen out the small geographic intersections, but I don't think it should hurt any computation at the moment.
Would you be able to update the crosswalk to use the better ZIP code map? It's not urgent, but I don't want to forget about it.
I took this opportunity to play a little with leaflet. I don't love the fact that our crosswalk currently includes ZIP codes that are not even close to KC. I think this is because the other shapefile you used is wonky (as you suggested). For example, it has much bigger ZIP code areas (because it includes the surrounding water, even though no one lives in the middle of Puget Sound).
# load packages ----
library(sf)
library(spatagg)
library(leaflet)
# load shapefiles
newhra = st_read('//dphcifs/APDE-CDIP/Shapefiles/HRA/hra_2020.shp')
newhra = newhra[, c('id', 'name', 'reg_id', 'reg_nm')]
newhra <- st_transform(newhra, 4326)
zip <- st_read("//dphcifs/APDE-CDIP/Shapefiles_protected/ZIP/adci_wa_zip_raw.shp")
names(zip)[names(zip) == 'POSTCOD'] <- 'ZIP'
zip <- st_transform(zip, 4326)
kcshape <- st_read("//Kcitfsrprpgdw01/kclib/Plibrary2/politicl/shapes/polygon/kingco.shp")
kcshape <- st_transform(kcshape, 4326)
# ZIP to HRA crosswalk
z2h_fo = create_xwalk(source = zip, target = newhra, source_id = 'ZIP', target_id = 'id', min_overlap = 0, method = "fractional overlap")
# Subset zip shapefile to zipcodes crosswalked by spatagg using the clean zipcode data
xwalkedZIP <- unique(z2h_fo$source_id)
xwalkedZIPshape <- zip[zip$ZIP %in% xwalkedZIP, ]
# plot shows that we selected the correct zipcodes
leaflet() %>%
addTiles() %>%
addPolygons(data = xwalkedZIPshape,
color = "white",
weight = 2,
fillOpacity = 0) %>%
addPolygons(data = kcshape,
color = "blue",
weight = 2,
fillOpacity = 0.2,
fillColor = "blue") %>%
addProviderTiles(providers$Esri.WorldImagery)
# plot showing some zipcodes currently crosswalked in rads.data are entirely outside of King County
currentZIP <- unique(rads.data::spatial_zip_to_hra20_geog[]$ZIP)
currentzipshape <- zip[zip$ZIP %in% currentZIP, ]
leaflet() %>%
addTiles() %>%
addPolygons(data = currentzipshape,
color = "white",
weight = 2,
fillOpacity = 0) %>%
addPolygons(data = kcshape,
color = "blue",
weight = 2,
fillOpacity = 0.2,
fillColor = "blue") %>%
addProviderTiles(providers$Esri.WorldImagery)
spatial_zip_to_hra20_geog
might have issues.Kai IM'd me to ask what xwalk he should use to identify ZIP codes in KC. I suggested
spatial_zip_to_hra20_pop
orspatial_zip_to_hra20_geog
.He wrote back and said there were some zips outside of KC. I thought they would be those that adjoin KC (e.g., 98026 where Edmonds touches Shoreline and gives us a tiny s2t_fraction). However, I took the time to map it out and it looks like there might be some outside of KC (run the code to see them on a map).
It's possible that I screwed up the shapefile processing, but can you can a look and make sure that our ZIP crosswalks are legit? The attached code should run fine on your computer and only take a minute.