datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

Address does not parse correctly #366

Open soapergem opened 4 months ago

soapergem commented 4 months ago

Here's an example of a valid US address which is not parsed correctly by this package:

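import json
import usaddress

# parse() returns a list of (token, label) pairs, one per whitespace-separated token
parsed = usaddress.parse("1509 Via Christina, Vista, CA 92084")
# Invert to label -> token; fine here because no label repeats in this address
components = {x[1]: x[0] for x in parsed}
print(json.dumps(components, indent=2))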

Here usaddress labels "Vista" as the StreetNamePostType instead of the PlaceName, so we end up with this:

{
  "AddressNumber": "1509",
  "StreetNamePreType": "Via",
  "StreetName": "Christina,",
  "StreetNamePostType": "Vista,",
  "StateName": "CA",
  "ZipCode": "92084"
}

I would expect "Vista" to come back as the PlaceName for an address like this.
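
In the meantime, a post-processing pass over the (token, label) pairs might serve as a stopgap. This is only a sketch: `relabel_missing_place` is a hypothetical helper, not part of usaddress, and the heuristic is my own assumption (no PlaceName was found, the StreetNamePostType token sits directly before the StateName, and the token before it ends with a comma). The input pairs are hard-coded from the output reported above so the snippet runs without the library.

```python
# (token, label) pairs in the shape usaddress.parse() returns,
# copied from the output reported in this issue.
parsed = [
    ("1509", "AddressNumber"),
    ("Via", "StreetNamePreType"),
    ("Christina,", "StreetName"),
    ("Vista,", "StreetNamePostType"),
    ("CA", "StateName"),
    ("92084", "ZipCode"),
]


def relabel_missing_place(pairs):
    """Hypothetical workaround, not part of usaddress.

    If no token was labeled PlaceName, relabel a StreetNamePostType token
    that directly precedes the StateName and follows a comma-terminated
    token, on the assumption that it is really the city.
    """
    fixed = list(pairs)
    if any(label == "PlaceName" for _, label in fixed):
        return fixed  # a city was already found; leave the parse alone

    for i in range(1, len(fixed) - 1):
        token, label = fixed[i]
        prev_token = fixed[i - 1][0]
        next_label = fixed[i + 1][1]
        if (
            label == "StreetNamePostType"
            and next_label == "StateName"
            and prev_token.endswith(",")
        ):
            fixed[i] = (token, "PlaceName")
    return fixed


fixed = relabel_missing_place(parsed)
# Strip the trailing commas that stay attached to the tokens.
components = {label: token.rstrip(",") for token, label in fixed}
print(components)
```

With the pairs above, "Vista" ends up under PlaceName and "Christina" under StreetName. The heuristic will misfire on addresses where a genuine post type follows a comma and no city is present, so it is not a substitute for a fix in the model.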