I'm using usaddress 0.5.10 under Python 3.10.1, calling usaddress.tag.
Case 1 -
`
usaddress.RepeatedLabelError:
ERROR: Unable to tag this string because more than one area of the string has the same label
ORIGINAL STRING: 9999 Walker LK Ontario Road,Hilton, NY 14468,US
PARSED TOKENS: [('9999', 'AddressNumber'), ('Walker', 'StreetName'), ('LK', 'StreetNamePostType'), ('Ontario', 'StreetName'), ('Road,', 'StreetNamePostType'), ('Hilton,', 'PlaceName'), ('NY', 'StateName'), ('14468,', 'ZipCode'), ('US', 'CountryName')]
UNCERTAIN LABEL: StreetName
`
It appears that LK, as an abbreviation for LAKE, isn't being processed correctly.
Case 2 -
`
usaddress.RepeatedLabelError:
ERROR: Unable to tag this string because more than one area of the string has the same label
ORIGINAL STRING: Beech Street Corp PO Box 999999,Richardson, TX 75085-3925,US
PARSED TOKENS: [('Beech', 'StreetName'), ('Street', 'StreetNamePostType'), ('Corp', 'PlaceName'), ('PO', 'USPSBoxType'), ('Box', 'USPSBoxType'), ('999999,', 'USPSBoxID'), ('Richardson,', 'PlaceName'), ('TX', 'StateName'), ('75085-3925,', 'ZipCode'), ('US', 'CountryName')]
UNCERTAIN LABEL: PlaceName
`
Case 3 -
`
ERROR: Unable to tag this string because more than one area of the string has the same label
ORIGINAL STRING: 99999 Bristol Blue St,Apex, NC 27502 4115,US
PARSED TOKENS: [('99999', 'AddressNumber'), ('Bristol', 'StreetName'), ('Blue', 'StreetName'), ('St,', 'StreetNamePostType'), ('Apex,', 'PlaceName'), ('NC', 'StateName'), ('27502', 'ZipCode'), ('4115,', 'ZipPlus4'), ('US', 'StateName')]
UNCERTAIN LABEL: StateName
`
Changing the case 2 input to Beech Street Corp, PO Box 999999,Richardson, TX 75085-3925,US (that is, adding a comma after Corp)
does parse, but I'm having trouble devising logic to handle this properly in general.
I also have some cases with two address lines where the parse failed, but succeeded once I removed the comma between the address lines.
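Since the comma between address lines helps in some cases and hurts in others, one workaround is to try both forms in turn. The following is only a minimal sketch of that idea: the helper names candidate_strings and first_successful are hypothetical (not part of usaddress), and the tagger is passed in as a callable so the logic does not depend on any particular library version.

```python
def candidate_strings(address_lines):
    """Build the input variants to try: lines joined with ', ' and with ' '.

    Hypothetical helper, not part of usaddress.
    """
    # Collapse internal whitespace and drop trailing commas and empty lines.
    lines = [" ".join(line.split()).rstrip(",") for line in address_lines]
    lines = [line for line in lines if line]
    candidates = []
    for separator in (", ", " "):
        joined = separator.join(lines)
        if joined not in candidates:
            candidates.append(joined)
    return candidates


def first_successful(candidates, tagger, errors=(Exception,)):
    """Return (candidate, result) for the first candidate the tagger accepts.

    Returns (None, None) if every candidate raises one of `errors`.
    """
    for candidate in candidates:
        try:
            return candidate, tagger(candidate)
        except errors:
            continue
    return None, None
```

With usaddress installed, the call would presumably look like first_successful(candidate_strings(lines), usaddress.tag, errors=(usaddress.RepeatedLabelError,)). This only papers over the problem; it does not tell me which form is the intended one.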
Case 3 seems to mishandle North Carolina: NC is tagged as StateName, but so is the trailing US.
Can you elaborate on the proper formatting of the input string (e.g. should it include commas? Should line delimiters be left out?)
The reason I ask is that I am seeing trailing commas kept on tokens such as StreetNamePostType ('Road,' and 'St,'), and so forth.
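For reference, this is roughly the post-processing I would otherwise have to do on the token list myself. It is a sketch only: group_tokens is a hypothetical helper, and it assumes the input is a list of (token, label) pairs in the same shape as the PARSED TOKENS output above. It strips trailing punctuation from each token, merges adjacent tokens that share a label, and keeps non-adjacent repeats as separate entries so they can be inspected.

```python
from collections import OrderedDict


def group_tokens(parsed_tokens):
    """Group (token, label) pairs into {label: [text, ...]}.

    Adjacent tokens with the same label are merged into one entry;
    a label that reappears later gets an additional entry.
    Hypothetical helper, not part of usaddress.
    """
    grouped = OrderedDict()
    previous_label = None
    for token, label in parsed_tokens:
        text = token.strip(" ,;")  # drop the trailing commas seen in the output
        runs = grouped.setdefault(label, [])
        if label == previous_label and runs:
            runs[-1] = f"{runs[-1]} {text}"
        else:
            runs.append(text)
        previous_label = label
    return grouped


# Token list copied from case 1 above.
case_1 = [('9999', 'AddressNumber'), ('Walker', 'StreetName'),
          ('LK', 'StreetNamePostType'), ('Ontario', 'StreetName'),
          ('Road,', 'StreetNamePostType'), ('Hilton,', 'PlaceName'),
          ('NY', 'StateName'), ('14468,', 'ZipCode'), ('US', 'CountryName')]

grouped = group_tokens(case_1)
# Labels that occur in more than one place in the string.
repeated = [label for label, runs in grouped.items() if len(runs) > 1]
```

For case 1, repeated ends up listing StreetName and StreetNamePostType, which at least makes it easy to flag the address for manual review instead of failing outright.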