datamade / usaddress

:us: a python library for parsing unstructured United States address strings into address components
https://parserator.datamade.us/usaddress
MIT License

tagging errors #348

Open joshred83 opened 1 year ago

joshred83 commented 1 year ago

The following call raises a RepeatedLabelError:

usaddress.tag("123 SMITHTON AVE # 3 FL QUEENS 12354 4321")

It isn't correctly identifying the occupancy identifier. A similar record:

usaddress.tag("123 SMITHTON AVE # 3 QUEENS 12354 4321")

This call incorrectly labels all of "# 3 QUEENS 12354 4321" as the occupancy identifier.

Do you have any recommended preprocessing steps?
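
One possible preprocessing pass, given as a minimal sketch rather than a confirmed fix: both records have a ZIP+4 split by a space ("12354 4321") and a unit marker separated from its number ("# 3"), so the sketch rejoins those before tagging. The helpers normalize_address and safe_tag are hypothetical names, not part of usaddress, and whether this normalization changes the labels usaddress assigns to these records has not been verified here. The only usaddress names used are usaddress.tag and usaddress.RepeatedLabelError, with its parsed_string attribute.

```python
import re

# Hypothetical helpers for illustration; not part of usaddress.

# A 5-digit ZIP followed by a space and a 4-digit extension at the end of the string.
_ZIP4_SPLIT = re.compile(r"\b(\d{5})\s+(\d{4})\s*$")
# A "#" separated by whitespace from the unit number that follows it.
_UNIT_GAP = re.compile(r"#\s+(?=\w)")


def normalize_address(raw: str) -> str:
    """Collapse whitespace, rejoin a split ZIP+4, and attach '#' to its unit."""
    text = " ".join(raw.split())
    text = _ZIP4_SPLIT.sub(r"\1-\2", text)
    text = _UNIT_GAP.sub("#", text)
    return text


def safe_tag(raw: str):
    """Tag a normalized address; fall back to the raw token labels on failure."""
    import usaddress  # imported here so normalize_address works without it

    cleaned = normalize_address(raw)
    try:
        return usaddress.tag(cleaned)
    except usaddress.RepeatedLabelError as err:
        # tag() refuses to merge repeated labels; parsed_string holds the
        # per-token (token, label) pairs so the record can still be inspected.
        return err.parsed_string, "Ambiguous"


print(normalize_address("123 SMITHTON AVE # 3 FL QUEENS 12354 4321"))
```

The print call shows the normalized form of the first record. Calling safe_tag on either record returns a two-item result in both the success and the fallback case, so callers can check the second item ("Ambiguous" is a label chosen for this sketch) before trusting the components.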