datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

Error processing address: 18778 RM 1431, ..., but processes 18778 FM 1431, ... #364

Open alekmarinov opened 6 months ago

alekmarinov commented 6 months ago
python
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import usaddress
>>> usaddress.tag("18778 FM 1431, Jonestown, TX, 78645")
(OrderedDict([('AddressNumber', '18778'), ('StreetNamePreType', 'FM'), ('StreetName', '1431'), ('PlaceName', 'Jonestown'), ('StateName', 'TX'), ('ZipCode', '78645')]), 'Street Address')
>>> usaddress.tag("18778 RM 1431, Jonestown, TX, 78645")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alekm/.local/share/virtualenvs/castle-service-python-56XQLIFV/lib/python3.8/site-packages/usaddress/__init__.py", line 177, in tag
    raise RepeatedLabelError(address_string, parse(address_string),
usaddress.RepeatedLabelError:
ERROR: Unable to tag this string because more than one area of the string has the same label

ORIGINAL STRING:  18778 RM 1431, Jonestown, TX, 78645
PARSED TOKENS:    [('18778', 'SubaddressIdentifier'), ('RM', 'SubaddressType'), ('1431,', 'SubaddressIdentifier'), ('Jonestown,', 'PlaceName'), ('TX,', 'StateName'), ('78645', 'ZipCode')]
UNCERTAIN LABEL:  SubaddressIdentifier

When this error is raised, it's likely that either (1) the string is not a valid person/corporation name or (2) some tokens were labeled incorrectly

To report an error in labeling a valid name, open an issue at https://github.com/datamade/usaddress/issues/new - it'll help us continue to improve probablepeople!

For more information, see the documentation at https://usaddress.readthedocs.io/
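
For anyone reading the traceback: the PARSED TOKENS line shows `SubaddressIdentifier` assigned to both `18778` and `1431,`, with `RM` (taken as a room abbreviation, `SubaddressType`) in between. The sketch below is not usaddress code. `merge_labels` is a hypothetical helper, and its rule (adjacent tokens with the same label are joined, a label that comes back after a different one is reported) is an assumption that matches the error text above. The token list is copied from the traceback.

```python
from collections import OrderedDict

# Token/label pairs copied from the PARSED TOKENS line of the traceback.
PARSED = [
    ("18778", "SubaddressIdentifier"),
    ("RM", "SubaddressType"),
    ("1431,", "SubaddressIdentifier"),
    ("Jonestown,", "PlaceName"),
    ("TX,", "StateName"),
    ("78645", "ZipCode"),
]


def merge_labels(pairs):
    """Hypothetical helper: collapse (token, label) pairs into one value per label.

    Returns (merged, None) on success, or (None, label) when a label
    reappears after a different label, i.e. in a second area of the string.
    """
    merged = OrderedDict()
    last_label = None
    for token, label in pairs:
        if label == last_label:
            # Same label as the previous token: extend that component.
            merged[label].append(token)
        elif label not in merged:
            # First time this label is seen: start a new component.
            merged[label] = [token]
        else:
            # Label seen earlier, but not adjacent: ambiguous.
            return None, label
        last_label = label
    # Join tokens and drop trailing separators such as the comma in "1431,".
    cleaned = OrderedDict(
        (label, " ".join(tokens).strip(" ,")) for label, tokens in merged.items()
    )
    return cleaned, None


print(merge_labels(PARSED))  # (None, 'SubaddressIdentifier')
```

Under that assumption the failure comes from the token labels themselves, since `RM` is read as a subaddress type rather than a street pre-type like `FM`. Until the labeling changes, a caller could catch `usaddress.RepeatedLabelError` and work from the token-level output of `usaddress.parse`, which is the same call the traceback shows `tag` using to build the error.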