mattyschell closed this issue 3 years ago
Summary statistics that sorta indicate this interpretation of NULL melissa suites is correct:
654,702 -- Number of NULL suites in melissa data that have no matching address point in subaddress
1,994 -- Number of NULL suites in melissa data with a matching subaddress. This smallish number of subaddresses would be deleted
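For anyone wanting to reproduce this kind of tally, here is a minimal sketch in Python. It assumes the geocoded melissa records are available as `(address_point_id, suite)` pairs with `None` standing in for NULL, and that we already have the set of address point ids that carry at least one subaddress. The function name, the input shapes, and the toy data are all hypothetical; this is not the query that produced the numbers above.

```python
from collections import Counter

def tally_null_suites(melissa_rows, subaddress_ap_ids):
    """Count NULL-suite melissa rows by whether the address point has subaddresses.

    melissa_rows: iterable of (address_point_id, suite); suite is None for NULL.
    subaddress_ap_ids: set of address point ids with at least one subaddress.
    Both shapes are assumptions made for this sketch.
    """
    counts = Counter()
    for ap_id, suite in melissa_rows:
        if suite is not None:
            continue  # only NULL suites are of interest here
        if ap_id in subaddress_ap_ids:
            counts["matching_subaddress"] += 1
        else:
            counts["no_matching_subaddress"] += 1
    return counts

# Toy data, not real address point ids
melissa_rows = [(1, None), (2, None), (2, "APT 1"), (3, None)]
subaddress_ap_ids = {2, 3}
counts = tally_null_suites(melissa_rows, subaddress_ap_ids)
print(dict(counts))
```

On the real data the two buckets would correspond to the two numbers reported above.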
Counterpoint: On inspection, a few of the 1,994 addresses look like they really do have subaddress units, which would mean the melissa data is wrong.
Address point id 21158 with 32 subaddresses, Apt L1 through 4H, looks accurate. Why does melissa deliver this address as having no units?
Case 1: According to the experts the null melissa suites are the "base address" for a group of units and should be completely ignored.
Case 2: We will attempt to get the original melissa data prior to geocoding. It is possible that this is an error in geocoding and the melissa data includes the units. Overall, however, most signs point to ignoring NULL melissa suites in this case as well.
Case 2: These are not errors in geocoding; the melissa data truly says there are no units for these addresses. I think there may be a combination of edge cases and bad data behind these 1,994 addresses where NYC has units and Melissa says nope.
NYC has units but they are unoccupied. See for example address point 58981, 11207 Queens Blvd. NYC says lots of units, Melissa says none. A quick glance at this location shows it's a big complex; maybe no one lives in it?
NYC has units but they are mistakes, like lots of duplicate NULL units. See for example address point 10171675, 13536 Roosevelt Ave, where NYC has 7 NULL subaddresses associated with this address.
The experts suggest another line of inquiry - review Melissa deliveries from prior years to see if these same (or similar) 1,994 "case 2" addresses are without units in the deliveries.
After the 2020 update and some cleanup of errors in subaddress the count is down slightly to 1,845. That is, 1,845 melissa addresses indicate no units at an address where CSCL says subaddresses do in fact exist.
Same pattern in the 2019 Melissa delivery. Most (close to all) of the same addresses appear without suites in the geocoded melissa data while very definitely having subaddresses in CSCL.
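A rough sketch of how the year-over-year comparison could be checked, assuming we can pull the address point ids of the "case 2" addresses from each delivery into plain lists. The function name and the toy ids are hypothetical and only illustrate the set arithmetic.

```python
def null_suite_overlap(ids_a, ids_b):
    """Compare two deliveries' NULL-suite address point ids.

    ids_a, ids_b: iterables of address point ids (assumed inputs).
    Returns counts of shared and delivery-specific ids, plus the
    fraction of delivery A's ids that also appear in delivery B.
    """
    a, b = set(ids_a), set(ids_b)
    shared = a & b
    return {
        "shared": len(shared),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "share_of_a": len(shared) / len(a) if a else 0.0,
    }

# Toy ids standing in for two yearly deliveries
overlap = null_suite_overlap([1, 2, 3, 4], [2, 3, 4, 5])
print(overlap)
```

A `share_of_a` close to 1.0 would match the "most (close to all)" observation.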
Below observe the facts without biases of the head or heart. Determine the arc's path, stroll leisurely to its terminus and the truth will fall at our feet.
Addresspoint id 2011474 with 35 subaddresses at 922 Southern Blvd, Bronx.
Addresspoint id 2001794 with 11 subaddresses at 2817 3rd Ave, Bronx.
Addresspoint id 6345 with 2 subaddresses at 3912 Crescent St, Long Island City.
Addresspoint id 127246 with 12 subaddresses at 13324 41st Ave, Flushing, Queens
Conclusion: NULL Melissa suites indicate one of several possibilities.
In any case processing NULL Melissa suites is, for now, not a task we can accomplish in this project.
The experts reviewed this issue on July 7 2021 and have a different take. When Melissa suites for an address point are NULL we should in fact delete the subaddresses in CSCL.
When buildings are demolished or have a type 1 alteration, editors will remove the address point from CSCL and create a new address point.
In "an address point in transition" cases editors have no reason to touch the building footprint in our current staffing and procedural setup. When a building is gutted and rehabbed with a new set of subaddresses we are entirely reliant on Melissa data to tell us what to do with the associated subaddresses.
Final answer - when we encounter only a NULL suite in the geocoded melissa data with matching subaddresses in CSCL we should delete the subaddresses from CSCL. See new issue for implementation.
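To make the rule concrete, here is a minimal sketch of one way to find the deletion candidates. It assumes melissa records arrive as `(address_point_id, suite)` pairs with `None` for NULL, and that CSCL subaddresses are available as a dict keyed by address point id. All names and shapes here are hypothetical, and it only lists candidates; it does not touch CSCL.

```python
from collections import defaultdict

def subaddress_delete_candidates(melissa_rows, cscl_subaddresses):
    """Return address point ids where melissa has ONLY a NULL suite
    and CSCL has at least one subaddress.

    melissa_rows: iterable of (address_point_id, suite); None means NULL.
    cscl_subaddresses: dict of address_point_id -> list of subaddress units.
    Both shapes are assumptions made for this sketch.
    """
    suites_by_ap = defaultdict(set)
    for ap_id, suite in melissa_rows:
        suites_by_ap[ap_id].add(suite)

    candidates = []
    for ap_id, suites in suites_by_ap.items():
        only_null = suites == {None}  # a NULL row alongside real suites is Case 1, skipped
        if only_null and cscl_subaddresses.get(ap_id):
            candidates.append(ap_id)
    return sorted(candidates)

# Toy data, not real address point ids
melissa_rows = [(10, None), (10, "APT 1"), (20, None), (30, None)]
cscl_subaddresses = {10: ["APT 1"], 20: ["APT A", "APT B"]}
candidates = subaddress_delete_candidates(melissa_rows, cscl_subaddresses)
print(candidates)
```

Address point 10 is skipped because its NULL row sits next to a real suite (the "base address" situation), and 30 is skipped because CSCL has nothing to delete there.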
Determine the meaning of NULL melissa suites.
Case 1, probably ignore that 6th row. But why is it there? Does it suggest that there exists a valid unit-free addressable location here?
Case 2, does this mean delete any subaddresses associated with this address, if subaddresses exist? This is the only record for the address.