datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

City not parsed correct #281

Open hema98 opened 4 years ago

hema98 commented 4 years ago

import usaddress
usaddress.parse('480 wahington Blvd,Jersey City,NJ-07310')

Output: [('480', 'AddressNumber'), ('wahington', 'StreetName'), ('Blvd,', 'StreetNamePostType'), ('Jersey', 'PlaceName'), ('City,', 'SubaddressType'), ('NJ-07310', 'SubaddressIdentifier')]

'Jersey City' should be tagged as the city ('PlaceName').

sunnyisle1 commented 4 years ago

I noticed the address parses as expected if the dash between the state and zip is removed: change "NJ-07310" to "NJ 07310". It seems to work with either a space or a comma ("NJ,07310").
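
If editing the input by hand is not practical, one possible workaround is to normalize the string before parsing. The snippet below is only a sketch: `normalize_state_zip` is a hypothetical helper, not part of usaddress, and it assumes the only problem is a dash joining a two-letter state abbreviation to a 5-digit ZIP (optionally ZIP+4). It has not been checked against other address formats.

```python
import re

# Hypothetical helper (not part of usaddress): matches a two-letter state
# abbreviation joined to a ZIP or ZIP+4 by a dash, e.g. "NJ-07310".
_STATE_ZIP = re.compile(r"\b([A-Za-z]{2})-(\d{5}(?:-\d{4})?)\b")


def normalize_state_zip(address: str) -> str:
    """Replace the dash between state and ZIP with a space."""
    return _STATE_ZIP.sub(r"\1 \2", address)


cleaned = normalize_state_zip("480 wahington Blvd,Jersey City,NJ-07310")
print(cleaned)

# Parse the cleaned string only if usaddress is installed.
try:
    import usaddress
except ImportError:
    usaddress = None

if usaddress is not None:
    print(usaddress.parse(cleaned))
```

The dash inside a ZIP+4 such as "07310-1234" is left alone, since only the dash directly after the state abbreviation is replaced.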