slu-openGIS / postmastr

R package for Processing and Parsing Untidy Street Addresses
https://slu-opengis.github.io/postmastr/
GNU General Public License v3.0

Need unit functionality help? #25

Open mtdukes opened 2 years ago

mtdukes commented 2 years ago

Love the functionality that already exists in postmastr, and it looks like there's a start on handling units. I've got a dataset with a lot of suites, boxes, floors, etc., and I'd love to be able to use this feature with the package, although it appears it's not functional yet.

Is there any way I can contribute to extend this functionality?

reprex

library(tidyverse)
library(postmastr)
reprex_address <- tibble(address = c('188 E CAPITOL STREET 300 ONE JACKSON PLACE, JACKSON, MS, 39201',
  '160 MINE LAKE CT STE 200, RALEIGH, NC, 27615',
  '7491 N FEDERAL HWY SUITE C 5 275, BOCA RATON, FL, 33487',
  '176 MINE LAKE COURT SUITE 100, RALEIGH, NC, 27615',
  'GENERAL SERVICES CORPORATION 2922 HATHAWAY ROAD, RICHMOND, VA, 23225-1724',
  'ATTN JENNY BELOTE CORPORATE OFFICE 16 CONSULTANT PLACE SUITE 104, DURHAM, NC, 27707-6313'))

reprex_cities <- pm_append(type = 'city',
                           input = c('JACKSON', 'RALEIGH', 'BOCA RATON', 'RICHMOND', 'DURHAM'),
                           output = c(NA, NA, NA, NA, NA))

reprex_pm_address <- reprex_address %>% 
  pm_identify(var = 'address')

reprex_pm_address %>%  
  pm_parse(input = 'full',
           address = 'address',
           output = 'short',
           keep_parsed = 'yes',
           city_dict = reprex_cities,
           include_units = TRUE
  )
#> # A tibble: 6 × 10
#>   address  pm.address pm.house pm.preDir pm.street pm.streetSuf pm.city pm.state
#>   <chr>    <chr>      <chr>    <chr>     <chr>     <chr>        <chr>   <chr>   
#> 1 188 E C… 188 E Cap… 188      E         Capitol … <NA>         JACKSON MS      
#> 2 160 MIN… 160 Mine … 160      <NA>      Mine Lak… <NA>         RALEIGH NC      
#> 3 7491 N … 7491 N Fe… 7491     N         Federal … <NA>         BOCA R… FL      
#> 4 176 MIN… 176 Mine … 176      <NA>      Mine Lak… <NA>         RALEIGH NC      
#> 5 GENERAL… General S… <NA>     <NA>      General … Rd           RICHMO… VA      
#> 6 ATTN JE… Attn Jenn… <NA>     <NA>      Attn Jen… <NA>         DURHAM  NC      
#> # … with 2 more variables: pm.zip <chr>, pm.zip4 <chr>

reprex_pm_address %>% 
  pm_has_unit()
#> Error in pm_has_unit(.): Error 2.

chris-prener commented 2 years ago

Thanks for reaching out @mtdukes! Unfortunately, my capacity for package development/maintenance has been greatly reduced. We made progress on this before the pandemic took off. There is an open PR that addresses this functionality. Do you want to give it a whirl?

mtdukes commented 2 years ago

Thanks for the response @chris-prener! I will check it out.
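
Until unit parsing is available in the package, one possible stopgap is to strip unit designators out of the street portion before calling pm_identify(). The following is only a rough base R sketch under stated assumptions and is not part of postmastr: extract_unit() is a hypothetical helper, and its short designator list (STE, SUITE, UNIT, APT, FLOOR) is an assumption that would need extending for real data. It also assumes the street portion is everything before the first comma, and it will not catch units with no designator, such as the "300" in the first reprex address.

```r
# Hypothetical helper (NOT a postmastr function): split a trailing unit such as
# "STE 200" or "SUITE C 5 275" out of the street portion of a full address.
extract_unit <- function(address) {
  # Assumption: the street portion is everything before the first comma
  street <- sub(",.*$", "", address)
  rest   <- sub("^[^,]*", "", address)  # e.g. ", RALEIGH, NC, 27615"

  # Assumed designator list; matches the designator and everything after it
  unit_pattern <- "\\s(STE|SUITE|UNIT|APT|FLOOR)\\s.*$"

  # regexpr() returns -1 where there is no match
  m <- regexpr(unit_pattern, street, perl = TRUE)
  has_unit <- m != -1

  # regmatches() only returns the matched substrings, so fill them in by index
  unit <- rep(NA_character_, length(street))
  unit[has_unit] <- trimws(regmatches(street, m))

  # Drop the unit from the street portion and reattach city, state, and ZIP
  street_clean <- sub(unit_pattern, "", street, perl = TRUE)

  data.frame(address = paste0(street_clean, rest),
             unit = unit,
             stringsAsFactors = FALSE)
}

# Usage with three of the addresses from the reprex
cleaned <- extract_unit(c(
  '160 MINE LAKE CT STE 200, RALEIGH, NC, 27615',
  '7491 N FEDERAL HWY SUITE C 5 275, BOCA RATON, FL, 33487',
  'GENERAL SERVICES CORPORATION 2922 HATHAWAY ROAD, RICHMOND, VA, 23225-1724'
))
print(cleaned)
```

The address column of the result could then be passed to pm_identify() and pm_parse() as in the reprex, with the unit column kept alongside to rejoin afterwards.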