street-address-rb / street-address

Detect, and dissect, US Street Addresses in strings.
MIT License

Does not handle street names within STREET_TYPES hash when informal: true #18

Closed aahmad closed 9 years ago

aahmad commented 9 years ago

Parsing an address whose street name is also a recognized street type gives the wrong result:

> StreetAddress::US.parse("13 N. Wells Street", informal: true)
=> 13 N. Wls

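Wells is a street name that is also a key in the STREET_TYPES hash, so it is abbreviated to "Wls" as if it were the street type, and the real type ("Street") is dropped. The same happens for any street name that is a key in that hash.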

John-Nagle commented 9 years ago

This is an inherent limitation of the parsing method. I ported the same parser to Python, and have roughly the same class of problems. Basically, you can parse about 95% of US addresses with simple regular expressions. The remaining 5% require a much greater effort.

Proper US street parsing, per USPS rules, is right to left, bottom to top. There's less ambiguity towards the end of the address, so that approach has more hope of working without a full street name database. Working back to front, once you've seen a street type, a second street type is probably an error. But there are exceptions in Salt Lake City and parts of Brooklyn.
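
To make the back-to-front idea concrete, here is a minimal sketch. It is not the library's code and not a full USPS parser: `SKETCH_STREET_TYPES`, `SKETCH_DIRECTIONS`, and `parse_right_to_left` are hypothetical names invented for illustration, and the type table is trimmed to a few entries. It only handles a single street line of the form "number, optional direction, name, optional type". Rather than flagging an earlier type-like word as an error, this sketch simply keeps it as part of the street name.

```ruby
# Hypothetical, heavily trimmed lookup tables for illustration only.
# The real STREET_TYPES hash in the library is much larger.
SKETCH_STREET_TYPES = {
  "street" => "St",  "st"  => "St",
  "avenue" => "Ave", "ave" => "Ave",
  "wells"  => "Wls", "wls" => "Wls"
}.freeze

SKETCH_DIRECTIONS = %w[n s e w ne nw se sw].freeze

# Parse one street line by consuming the unambiguous ends first.
# Only the LAST token may be taken as the street type; any earlier token
# that also looks like a type (e.g. "Wells") stays in the street name.
def parse_right_to_left(line)
  normalize = ->(token) { token.downcase.delete(".") }
  tokens = line.split

  # Leading house number, if present.
  number = tokens.shift if tokens.first =~ /\A\d+\z/

  # Optional prefix direction, but never consume the only remaining token.
  prefix = nil
  if tokens.size > 1 && SKETCH_DIRECTIONS.include?(normalize.call(tokens.first))
    prefix = tokens.shift.delete(".").upcase
  end

  # Work from the right: take a type only if a name would still remain.
  street_type = nil
  if tokens.size > 1 && SKETCH_STREET_TYPES.key?(normalize.call(tokens.last))
    street_type = SKETCH_STREET_TYPES[normalize.call(tokens.pop)]
  end

  { number: number, prefix: prefix, street: tokens.join(" "), street_type: street_type }
end

result = parse_right_to_left("13 N. Wells Street")
result[:street]       # "Wells"
result[:street_type]  # "St"
```

The guard `tokens.size > 1` is the design choice that matters here: with "13 N. Wells" and no trailing type, "Wells" is the only token left, so it is kept as the name instead of being abbreviated.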

derrek commented 9 years ago

@John-Nagle is correct. The approach taken in this code makes it hard to deal with the issue you describe. I don't have the time or the brain power to rethink this code base from the ground up to solve it. If you can fix it, I'll accept a pull request.