street-address-rb / street-address

Detect, and dissect, US Street Addresses in strings.
MIT License
168 stars, 85 forks

Handle streets where the name overlaps the street-type map #33

Open hannahwhy opened 8 years ago

hannahwhy commented 8 years ago

Some addresses like

14168 W RIVER RD
COLUMBIA STATION, OH 44028-9430

are parsed with the street as "W", the street type as "River" (which abbreviates to "riv"), and the city as "RD \nCOLUMBIA STATION". The intended parse is prefix "W", street "River", and street type "Rd". My examples are all from Ohio because that's my current data set, but the problem isn't Ohio-specific: in Illinois, for example, there's a River Road that follows the Des Plaines River.

The erroneous parse appears to come from the "100 South Street" special case in the street regexp, which lets a directional such as "W" stand as the street name when a street-type word follows it. This commit adds a second special case with higher precedence that matches [prefix, non-numeric street, street type] sequences. The street match excludes numerics to preserve the existing parse of the "6641 N 2200 W Apt D304 Park City, UT 84098" case.
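To make the precedence argument concrete, here is a minimal toy sketch in Ruby. It is not the gem's actual regexp: the tables are tiny stand-ins, and `DIRECTION_AS_STREET`, `PREFIXED_STREET`, `OLD_ORDER`, `NEW_ORDER`, and `parse_street` are hypothetical names made up for illustration. It only models the street portion of the address and assumes a single-word street name.

```ruby
# Toy tables (assumption): the real street-type and direction maps are much larger.
TYPES = %w[road rd street st river riv ridge rdg].freeze
DIRS  = %w[north south east west n s e w].freeze

TYPE = "(?:#{TYPES.join('|')})"
DIR  = "(?:#{DIRS.join('|')})"

# Existing-style special case ("100 South Street"):
# a direction word acts as the street name, followed by a street type.
DIRECTION_AS_STREET = /\A(?<street>#{DIR})\s+(?<street_type>#{TYPE})\b/i

# Hypothetical higher-precedence case: [prefix, non-numeric street, street type].
# The street is limited to one word without digits, so "N 2200 W" never matches here.
PREFIXED_STREET = /\A(?<prefix>#{DIR})\s+(?<street>[^\d\s]+)\s+(?<street_type>#{TYPE})\b/i

OLD_ORDER = [DIRECTION_AS_STREET].freeze
NEW_ORDER = [PREFIXED_STREET, DIRECTION_AS_STREET].freeze

# Try each pattern in order and return the named captures of the first match.
def parse_street(text, patterns)
  patterns.each do |pattern|
    match = pattern.match(text)
    return match.named_captures if match
  end
  nil
end

# Old order: street "W", street type "RIVER"; "RD" is left over for the city.
p parse_street("W RIVER RD", OLD_ORDER)

# New order: prefix "W", street "RIVER", street type "RD".
p parse_street("W RIVER RD", NEW_ORDER)

# "South Street" still falls through to the existing special case.
p parse_street("South Street", NEW_ORDER)

# Numeric streets match neither toy pattern, so other rules would handle them.
p parse_street("N 2200 W", NEW_ORDER)
```

In this sketch the new pattern can only win when three tokens are present and the middle one has no digits, which is why putting it first should not disturb either the "South Street" case or the numeric-street case.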

SalvatoreT commented 8 years ago

@derrek, this seems pretty neat.

jsmestad commented 7 years ago

I would really like to see this. "Ridge Road" suffers from the same problem.