GreenBuildingRegistry / usaddress-scourgify

Clean US addresses following USPS pub 28 and RESO guidelines
MIT License

Address does not parse correctly #38

Open soapergem opened 4 months ago

soapergem commented 4 months ago

Here's an example of a valid US address which is not parsed correctly by scourgify:

import json
import scourgify

components = scourgify.normalize_address_record("1509 Via Christina, Vista, CA 92084")
print(json.dumps(components, indent=2))

What happens here is that scourgify treats the city name (Vista) as a street suffix, abbreviates it to VIS, and appends it to the street line, so you end up with this:

{
  "address_line_1": "1509 VIA CHRISTINA VIS",
  "address_line_2": null,
  "city": null,
  "state": "CA",
  "postal_code": "92084"
}

I would obviously expect it to handle addresses like this correctly.

soapergem commented 4 months ago

I just created a sister issue in the usaddress package, but even if it isn't fixed there, I think scourgify could still handle this more intelligently. If there is no PlaceName, perhaps it could just interpret the StreetNamePostType as the city?
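
A minimal sketch of that fallback, for illustration only. It assumes the parser output is a list of (token, label) pairs, which is the shape usaddress.parse returns. The function name reassign_city is hypothetical and is not part of scourgify or usaddress, and the sample token list is hand-written to mimic the behaviour described in this issue rather than captured from a real usaddress run.

```python
def reassign_city(tagged):
    """Relabel the last StreetNamePostType as PlaceName when no city was found.

    `tagged` is a list of (token, label) pairs. Hypothetical helper, not an
    existing scourgify or usaddress function.
    """
    labels = [label for _, label in tagged]

    # Nothing to do if a city was already found, or if there is no
    # street suffix available to reinterpret.
    if "PlaceName" in labels or "StreetNamePostType" not in labels:
        return list(tagged)

    # Index of the last token labelled as a street suffix.
    last = len(labels) - 1 - labels[::-1].index("StreetNamePostType")

    return [
        (token, "PlaceName" if i == last else label)
        for i, (token, label) in enumerate(tagged)
    ]


# Hand-written labels (an assumption) matching the symptom reported above:
# "Vista" is labelled as a street suffix and no PlaceName is present.
sample = [
    ("1509", "AddressNumber"),
    ("Via", "StreetName"),
    ("Christina,", "StreetName"),
    ("Vista,", "StreetNamePostType"),
    ("CA", "StateName"),
    ("92084", "ZipCode"),
]

fixed = reassign_city(sample)
for token, label in fixed:
    print(f"{label:20} {token}")
```

One trade-off of this heuristic: an address that really has no city, such as a street line followed only by state and ZIP code, would have its genuine suffix moved into the city field, so any real fix would probably need an extra check before relabelling.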