datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

Issue with Placename being two words #269

Open addisonp opened 4 years ago

addisonp commented 4 years ago

usaddress.RepeatedLabelError:
ERROR: Unable to tag this string because more than one area of the string has the same label

ORIGINAL STRING: 101 W Avenida Vista Hermosa Suite 122 San Clemente CA 92672
PARSED TOKENS: [('101', 'AddressNumber'), ('W', 'StreetNamePreDirectional'), ('Avenida', 'StreetName'), ('Vista', 'StreetNamePostType'), ('Hermosa', 'PlaceName'), ('Suite', 'OccupancyType'), ('122', 'OccupancyIdentifier'), ('San', 'PlaceName'), ('Clemente', 'PlaceName'), ('CA', 'StateName'), ('92672', 'ZipCode')]
UNCERTAIN LABEL: PlaceName
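To make it easier to see why the tagger gives up on this string, here is a minimal sketch that groups the parsed tokens from the error above into consecutive runs and reports which labels occur in more than one separate run. It is plain Python working on the token list only and is not part of usaddress; the helper names `group_consecutive_labels` and `repeated_labels` are made up for illustration. It does not fix the underlying mislabeling of "Vista Hermosa", it only shows that "Hermosa" and "San Clemente" form two separate PlaceName runs.

```python
from itertools import groupby

def group_consecutive_labels(parsed_tokens):
    """Collapse adjacent tokens that share a label into one (text, label) pair."""
    return [
        (" ".join(token for token, _ in run), label)
        for label, run in groupby(parsed_tokens, key=lambda pair: pair[1])
    ]

def repeated_labels(parsed_tokens):
    """Return the labels that show up in more than one non-adjacent run."""
    seen, repeated = set(), []
    for _, label in group_consecutive_labels(parsed_tokens):
        if label in seen and label not in repeated:
            repeated.append(label)
        seen.add(label)
    return repeated

# Token list copied from the PARSED TOKENS line of the error message.
parsed = [
    ("101", "AddressNumber"), ("W", "StreetNamePreDirectional"),
    ("Avenida", "StreetName"), ("Vista", "StreetNamePostType"),
    ("Hermosa", "PlaceName"), ("Suite", "OccupancyType"),
    ("122", "OccupancyIdentifier"), ("San", "PlaceName"),
    ("Clemente", "PlaceName"), ("CA", "StateName"), ("92672", "ZipCode"),
]

grouped = group_consecutive_labels(parsed)
conflicts = repeated_labels(parsed)
print(grouped)
print(conflicts)
```

The same helpers should work on the list of (token, label) pairs returned by `usaddress.parse`, which labels each token without raising this error, so it can serve as a fallback when `usaddress.tag` fails.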

bleckley commented 4 years ago

I've run into the same issue with Ann Arbor, except that it parses "Ann" into the StreetNameSuffix column, leaving only "Arbor" as the PlaceName.