Closed mqli322 closed 6 years ago
@mqli322 @AmandaEyer
1. Would you like that flagged as a potential outlier? I'd like to avoid adding another field.
2. duplicates.sql has the logic for how duplicates are flagged. I added logic to remove spaces, but we're not adding logic to normalize addresses. If that would be useful, we can move this step to after we geocode and get the preferred address from GeoClient.
3. Not an error.
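To make the space-removal comparison concrete, here is a minimal Python sketch of the idea. It is not the contents of duplicates.sql; the function names, the uppercase step, and the sample records are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def address_key(address: str) -> str:
    """Uppercase the address and drop all whitespace so spacing differences are ignored."""
    return re.sub(r"\s+", "", address.upper())


def possible_duplicates(jobs):
    """Group job numbers whose addresses collapse to the same key.

    `jobs` is an iterable of (job_number, address) pairs (hypothetical shape).
    Only groups containing two or more jobs are returned.
    """
    groups = defaultdict(list)
    for job_number, address in jobs:
        groups[address_key(address)].append(job_number)
    return {key: nums for key, nums in groups.items() if len(nums) > 1}


# Made-up sample records, for illustration only.
sample = [
    ("A1", "123  Main Street"),  # extra space
    ("A2", "123 Main Street"),
    ("A3", "123 Main St"),       # abbreviated street type
]
print(possible_duplicates(sample))
```

In this sample, A1 and A2 are grouped together, while A3 is not, because stripping spaces alone does not reconcile "St" with "Street". Catching that case would need the address normalization discussed in item 2.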
Added logic in qc_outliers that flags records WHERE dob_type = 'NB' AND (units_net_complete::integer - units_prop::integer) >= 50
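As a rough illustration of that rule, the sketch below restates the condition in Python. The function name and the `threshold` parameter are hypothetical, and treating missing values as "not flagged" is an assumption that mirrors how a NULL comparison would fail the WHERE clause.

```python
def is_unit_outlier(dob_type, units_net_complete, units_prop, threshold=50):
    """Return True when a new-building job completed at least `threshold`
    more units than were proposed.

    Values are cast with int(), mirroring the ::integer casts in the rule.
    """
    if dob_type != "NB":
        return False
    # Assumption: a missing value means the record is not flagged.
    if units_net_complete is None or units_prop is None:
        return False
    return int(units_net_complete) - int(units_prop) >= threshold


# Made-up values, for illustration only.
print(is_unit_outlier("NB", "120", "60"))  # difference of 60
print(is_unit_outlier("NB", "100", "60"))  # difference of 40
print(is_unit_outlier("A1", "120", "60"))  # not a new building
```

The comparison uses the net completed total rather than any single year, which matches the decision not to evaluate each year separately.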
I didn't do it by individual year for several reasons, the first being the ability to maintain the code.
I think we need to sit down together to figure this one out. I am not sure this portion of the code went through. For instance, 121324717 and 121186171 had more units receiving TCOs than units_prop, and so have a negative change number in one year, which doesn't make sense. Basically, almost no NBs should have negative unit counts.
Closing - the original issues have been addressed and the math is correct on these records
[x] Where CO of any given year is 50+ units greater than units proposed (NBs only)
[ ] Add to script for identifying possible dupes: flag cases where the address field contains an extra space or is spelled using an abbreviated version of avenue/street (for example, jobs 104869108 & 121331120)
[ ] If multiple jobs were geocoded to the same lat/long (for example, ~600+ addresses were geocoded to the same lat/long and the same BBL (4163500400) via the GBAT-A method)
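
One possible way to approach the last checklist item is to count how many jobs share the same coordinates and BBL. The Python sketch below assumes each record is a dict with hypothetical `latitude`, `longitude`, and `bbl` keys; the sample values are made up and do not come from the dataset.

```python
from collections import Counter


def shared_locations(records, min_jobs=2):
    """Count jobs per (latitude, longitude, bbl) and keep locations
    shared by at least `min_jobs` jobs.

    Coordinates are compared for exact equality, which suits values that
    come from the same geocoder output; rounding would be an extra choice.
    """
    counts = Counter((r["latitude"], r["longitude"], r["bbl"]) for r in records)
    return {loc: n for loc, n in counts.items() if n >= min_jobs}


# Made-up sample records, for illustration only.
records = [
    {"job": "J1", "latitude": 40.1, "longitude": -73.9, "bbl": "B1"},
    {"job": "J2", "latitude": 40.1, "longitude": -73.9, "bbl": "B1"},
    {"job": "J3", "latitude": 40.2, "longitude": -73.8, "bbl": "B2"},
]
print(shared_locations(records))
```

Raising `min_jobs` would limit the output to heavily shared points, such as the case described above where hundreds of addresses land on one location.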