slu-openGIS / postmastr

R package for Processing and Parsing Untidy Street Addresses
https://slu-opengis.github.io/postmastr/
GNU General Public License v3.0

Some notes from a real use case #13

Open alankjackson opened 5 years ago

alankjackson commented 5 years ago

I just used the package to help turn a 200,000-record permit database into a geocoding file (https://cohgis-mycity.opendata.arcgis.com/datasets/permits-wm-structural?geometry=-97.509%2C29.379%2C-93.263%2C30.213).

A few things I noted.

A handful of records were missing the zip, and I didn't see a filter to remove those records, so I used:

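library(dplyr)  # filter(), select(), %>%
# keep only records where pm_postal_detect() found a zip, then drop the flag
geo2 <- pm_postal_detect(geo2) %>%
  filter(pm.hasZip) %>%
  select(-pm.hasZip)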
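Before dropping records like this, it can help to count how many would be lost. The following is a minimal sketch on made-up data that uses only dplyr and stringr. The `geo_demo` table, the `has_zip` column, and the regular expression are assumptions for illustration; the regex is a rough stand-in for the `pm.hasZip` flag, not postmastr's actual detection logic.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; the real input would be the permit addresses
geo_demo <- tibble(
  address = c("7301 AVE K, 77011", "5146 LONGMONT DR", "1212 W DREW ST, 77006")
)

# Stand-in for the pm.hasZip flag: TRUE when the string ends in a 5-digit zip
flagged <- geo_demo %>%
  mutate(has_zip = str_detect(address, "\\b[0-9]{5}$"))

# Count how many records would be dropped before filtering them out
n_missing <- sum(!flagged$has_zip)

# Keep only the records with a zip, then drop the helper column
kept <- flagged %>%
  filter(has_zip) %>%
  select(-has_zip)
```

Checking `n_missing` first makes it easy to confirm that only "a handful" of records are being removed rather than a large share of the file.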

A few records had unit numbers attached directly to the house number (with no space between them), for example:

7301#1 AVE K, 77011
5146#2 LONGMONT DR, 77056
1212A W DREW ST, 77006
2002C GENESEE ST, 77006

so I cleaned them up with:

library(stringr)  # str_remove()
geo2$pm.house <- str_remove(geo2$pm.house, "[A-Z#][0-9]*")
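If the unit is worth keeping rather than discarding, one option is to pull it into its own column before stripping it from the house number. This is only a sketch: the sample vector mirrors the house numbers quoted above, the names `pm_house` and `pm_unit` are hypothetical (not columns that postmastr creates), and anchoring the pattern with `$` is an added assumption that the unit always sits at the end of the house number.

```r
library(stringr)

# Hypothetical sample mirroring the house numbers quoted above,
# plus one plain house number with no unit attached
house <- c("7301#1", "5146#2", "1212A", "2002C", "4500")

# Unit suffix: a capital letter or '#', then optional digits, at the end
unit_pattern <- "[A-Z#][0-9]*$"

# Extract the unit first (NA where no unit is attached), then strip it
pm_unit  <- str_extract(house, unit_pattern)
pm_house <- str_remove(house, unit_pattern)

result <- data.frame(house, pm_house, pm_unit)
```

Keeping the extracted unit alongside the cleaned house number means the information is still available if the geocoder or a later join needs it.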

In general the package performed very well. Thanks for your efforts.

chris-prener commented 5 years ago

Thanks, @alankjackson. We're getting an update ready to handle units like the ones you've described. We don't intend to duplicate dplyr's subsetting functionality at this time, so filtering with dplyr as you did is the approach we would recommend. Glad to hear it is working well; we'll have a big update out soon!