:us: a python library for parsing unstructured United States address strings into address components
Error parsing La in city name (e.g. La Quinta) as Louisiana using .tag #358
Open
michaeljclausen opened 7 months ago
I've had some addresses work fine, such as:
"49000 Calle Flora, La Quinta, CA 92253 United States"
(OrderedDict([('AddressNumber', '49000'), ('StreetName', 'Calle Flora'), ('PlaceName', 'La Quinta'), ('StateName', 'CA'), ('ZipCode', '92253'), ('CountryName', 'United States')]), 'Street Address')
However, "8100 Peary Place, La Quinta, California 92253 United States" results in:
ERROR: Unable to tag this string because more than one area of the string has the same label
ORIGINAL STRING: 8100 Peary Place, La Quinta, California 92253 United States
PARSED TOKENS: [('8100', 'AddressNumber'), ('Peary', 'StreetName'), ('Place,', 'StreetNamePostType'), ('La', 'StateName'), ('Quinta,', 'PlaceName'), ('California', 'StateName'), ('92253', 'ZipCode'), ('United', 'CountryName'), ('States', 'CountryName')]
UNCERTAIN LABEL: StateName
After testing, it seems that a street name ending in 'Place' trips up the parser.
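For anyone reproducing this, here is a minimal sketch of the condition the error message describes: a label that shows up in more than one separate area of the string. The helper `find_repeated_labels` is hypothetical and is not the library's own implementation. `tag_or_parse` is also a hypothetical wrapper; it assumes the library is `usaddress` (as the description and error output suggest) and relies on its `tag` function and `RepeatedLabelError` exception, falling back to the raw token labels when tagging fails.

```python
def find_repeated_labels(parsed_tokens):
    """Return labels that occur in more than one non-adjacent run of tokens.

    Hypothetical helper: it mirrors the condition named in the error
    message, not the library's internal code.
    """
    seen = []        # labels in the order their first run started
    repeated = []    # labels that started a second, separate run
    previous = None
    for _token, label in parsed_tokens:
        if label != previous:  # a new run of this label begins here
            if label in seen and label not in repeated:
                repeated.append(label)
            seen.append(label)
        previous = label
    return repeated


# Token/label pairs copied from the PARSED TOKENS output above.
PARSED_TOKENS = [
    ('8100', 'AddressNumber'), ('Peary', 'StreetName'),
    ('Place,', 'StreetNamePostType'), ('La', 'StateName'),
    ('Quinta,', 'PlaceName'), ('California', 'StateName'),
    ('92253', 'ZipCode'), ('United', 'CountryName'),
    ('States', 'CountryName'),
]

print(find_repeated_labels(PARSED_TOKENS))  # ['StateName']


def tag_or_parse(address):
    """Try tag(); on a repeated label, return the raw token labels instead.

    Assumes usaddress is installed; imported lazily so the helper above
    works without it.
    """
    import usaddress  # third-party dependency (assumption)
    try:
        return usaddress.tag(address)
    except usaddress.RepeatedLabelError as err:
        # parsed_string holds the (token, label) pairs shown in the error
        return err.parsed_string, 'Ambiguous'
```

'La' and 'California' both carry StateName but are separated by 'Quinta,', so the label spans two areas, while the adjacent 'United' and 'States' count as one. The 'Ambiguous' marker in the fallback is an arbitrary placeholder, not a value the library returns.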