datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

ERROR: Unable to tag this string because more than one area of the string has the same label #305

Open shunlictl opened 3 years ago

shunlictl commented 3 years ago

Trying to parse a Google-formatted address: The Boulder Apartments, 210 Simpson Parkway #R-100, 210 Simpson-Parkway, Cheney, WA 99004, USA

usaddress.tag("The Boulder Apartments, 210 Simpson Parkway #R-100, 210 Simpson-Parkway, Cheney, WA 99004, USA")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/qfmig/.local/lib/python3.6/site-packages/usaddress/__init__.py", line 178, in tag
    label)
usaddress.RepeatedLabelError: ERROR: Unable to tag this string because more than one area of the string has the same label

ORIGINAL STRING: The Boulder Apartments, 210 Simpson Parkway #R-100, 210 Simpson-Parkway, Cheney, WA 99004, USA
PARSED TOKENS: [('The', 'Recipient'), ('Boulder', 'Recipient'), ('Apartments,', 'Recipient'), ('210', 'AddressNumber'), ('Simpson', 'StreetName'), ('Parkway', 'StreetNamePostType'), ('#', 'OccupancyIdentifier'), ('R-100,', 'OccupancyIdentifier'), ('210', 'AddressNumber'), ('Simpson-Parkway,', 'StreetName'), ('Cheney,', 'PlaceName'), ('WA', 'StateName'), ('99004,', 'ZipCode'), ('USA', 'CountryName')]
UNCERTAIN LABEL: AddressNumber

When this error is raised, it's likely that either (1) the string is not a valid person/corporation name or (2) some tokens were labeled incorrectly

To report an error in labeling a valid name, open an issue at https://github.com/datamade/usaddress/issues/new - it'll help us continue to improve probablepeople!

For more information, see the documentation at https://usaddress.readthedocs.io/
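To make the failure easier to see, here is a minimal sketch (not usaddress's own code) that scans the PARSED TOKENS from the error output and lists every label that shows up in more than one separate run. The helper name `repeated_labels` is hypothetical. It deliberately avoids importing usaddress so it runs on its own.

```python
from itertools import groupby

# PARSED TOKENS copied from the error output in this issue.
PARSED_TOKENS = [
    ('The', 'Recipient'), ('Boulder', 'Recipient'), ('Apartments,', 'Recipient'),
    ('210', 'AddressNumber'), ('Simpson', 'StreetName'),
    ('Parkway', 'StreetNamePostType'),
    ('#', 'OccupancyIdentifier'), ('R-100,', 'OccupancyIdentifier'),
    ('210', 'AddressNumber'), ('Simpson-Parkway,', 'StreetName'),
    ('Cheney,', 'PlaceName'), ('WA', 'StateName'),
    ('99004,', 'ZipCode'), ('USA', 'CountryName'),
]


def repeated_labels(tokens):
    """Return labels that occur in more than one non-adjacent run.

    Hypothetical helper: `tokens` is a list of (token, label) pairs.
    Labels are returned in the order the repetition is first detected.
    """
    seen = []
    repeated = []
    # groupby collapses consecutive tokens sharing a label into one run.
    for label, _run in groupby(tokens, key=lambda pair: pair[1]):
        if label in seen and label not in repeated:
            repeated.append(label)
        seen.append(label)
    return repeated


print(repeated_labels(PARSED_TOKENS))
```

For this string both `AddressNumber` and `StreetName` appear in two separate areas, because the Google format repeats the street address ("210 Simpson Parkway" and "210 Simpson-Parkway"). The error names `AddressNumber`, which is the first of the two to repeat.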

usaddress.parse("The Boulder Apartments, 210 Simpson Parkway #R-100, 210 Simpson-Parkway, Cheney, WA 99004, USA")
[('The', 'Recipient'), ('Boulder', 'Recipient'), ('Apartments,', 'Recipient'), ('210', 'AddressNumber'), ('Simpson', 'StreetName'), ('Parkway', 'StreetNamePostType'), ('#', 'OccupancyIdentifier'), ('R-100,', 'OccupancyIdentifier'), ('210', 'AddressNumber'), ('Simpson-Parkway,', 'StreetName'), ('Cheney,', 'PlaceName'), ('WA', 'StateName'), ('99004,', 'ZipCode'), ('USA', 'CountryName')]
usaddress.tag("The Boulder Apartments, 210 Simpson Parkway #R-100, 210 Simpson-Parkway, Cheney, WA 99004, USA")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/qfmig/.local/lib/python3.6/site-packages/usaddress/__init__.py", line 178, in tag
    label)
usaddress.RepeatedLabelError: ERROR: Unable to tag this string because more than one area of the string has the same label

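Since `usaddress.parse` returns the token list without raising, one possible workaround is to group that output yourself and keep every run of a repeated label instead of failing. The sketch below is an assumption about how that could look, not part of the library: `tag_lenient` is a hypothetical helper, and the join-and-strip step is only meant to approximate how tagged components are usually cleaned up. It takes the token list as input so it runs without usaddress installed.

```python
from collections import OrderedDict
from itertools import groupby

# Output of usaddress.parse(...) as shown in this issue.
PARSED_TOKENS = [
    ('The', 'Recipient'), ('Boulder', 'Recipient'), ('Apartments,', 'Recipient'),
    ('210', 'AddressNumber'), ('Simpson', 'StreetName'),
    ('Parkway', 'StreetNamePostType'),
    ('#', 'OccupancyIdentifier'), ('R-100,', 'OccupancyIdentifier'),
    ('210', 'AddressNumber'), ('Simpson-Parkway,', 'StreetName'),
    ('Cheney,', 'PlaceName'), ('WA', 'StateName'),
    ('99004,', 'ZipCode'), ('USA', 'CountryName'),
]


def tag_lenient(tokens):
    """Group (token, label) pairs into {label: [text of each run]}.

    Hypothetical helper: repeated labels produce several list entries
    instead of raising an error.
    """
    grouped = OrderedDict()
    for label, run in groupby(tokens, key=lambda pair: pair[1]):
        # Join the consecutive tokens of one run and trim separators.
        text = " ".join(token for token, _ in run).strip(" ,;")
        grouped.setdefault(label, []).append(text)
    return grouped


for label, values in tag_lenient(PARSED_TOKENS).items():
    print(label, values)
```

With this grouping, `AddressNumber` maps to two entries and `StreetName` to two entries, so the caller can decide which occurrence to keep (for example the first) rather than losing the whole address.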