Hi there, I appreciate your work on the US address normalization library!
I notice the following issue:
>>> from scourgify import normalize_address_record
>>> normalize_address_record('12345 Somewhere Street Apt 1, Town, MA 12345')
{'address_line_1': '12345 SOMEWHERE ST', 'address_line_2': 'UNIT 1', 'city': 'TOWN', 'state': 'MA', 'postal_code': '12345'}
I believe it has to do with the following bit of code in scourgify.normalize.normalize_occupancy_type:
default = default if default is not None else 'UNIT'
occupancy_type_label = 'OccupancyType'
occupancy_type = parsed_addr.pop(occupancy_type_label, None)
occupancy_type_abbr = OCCUPANCY_TYPE_ABBREVIATIONS.get(occupancy_type)
occupancy_id = parsed_addr.get('OccupancyIdentifier')
if ((occupancy_id and not occupancy_id.startswith('#'))
        and not occupancy_type_abbr):
    occupancy_type_abbr = default
...
When I step through in the debugger, the parsed occupancy_type is 'APT'. However, occupancy_type_abbr ends up as None, because OCCUPANCY_TYPE_ABBREVIATIONS maps <full_name> -> <abbreviation>, so looking up the already-abbreviated 'APT' finds no key and the code falls back to the default 'UNIT'.
I suggest the following fix:
if occupancy_type in OCCUPANCY_TYPE_ABBREVIATIONS.values():
    occupancy_type_abbr = occupancy_type
else:
    occupancy_type_abbr = OCCUPANCY_TYPE_ABBREVIATIONS.get(occupancy_type)
Then you can cover both cases.
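For illustration, here is a minimal standalone sketch of the suggested lookup. It is not the library's actual code: the helper name resolve_occupancy_abbr is hypothetical, and the mapping below is a small assumed subset in the <full_name> -> <abbreviation> format, not the real table in scourgify.

```python
# Assumed subset of the mapping, for illustration only; the real
# OCCUPANCY_TYPE_ABBREVIATIONS table in scourgify is larger.
OCCUPANCY_TYPE_ABBREVIATIONS = {
    'APARTMENT': 'APT',
    'SUITE': 'STE',
    'BUILDING': 'BLDG',
}


def resolve_occupancy_abbr(occupancy_type, occupancy_id, default='UNIT'):
    """Hypothetical helper mirroring the proposed fix.

    Accepts either a full name ('APARTMENT') or an existing
    abbreviation ('APT') and returns the abbreviation.
    """
    if occupancy_type in OCCUPANCY_TYPE_ABBREVIATIONS.values():
        # Already abbreviated: keep it as-is.
        occupancy_type_abbr = occupancy_type
    else:
        # Full name (or unknown): look it up; None if not found.
        occupancy_type_abbr = OCCUPANCY_TYPE_ABBREVIATIONS.get(occupancy_type)

    # Same fallback as the existing code: use the default only when
    # there is an identifier without '#' and no recognized type.
    if (occupancy_id and not occupancy_id.startswith('#')
            and not occupancy_type_abbr):
        occupancy_type_abbr = default
    return occupancy_type_abbr


print(resolve_occupancy_abbr('APT', '1'))
print(resolve_occupancy_abbr('APARTMENT', '1'))
print(resolve_occupancy_abbr(None, '1'))
```

With this check, both 'APT' and 'APARTMENT' resolve to 'APT', and the 'UNIT' default is only used when no occupancy type was recognized at all.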