Closed mgraber closed 4 years ago
@mgraber Here are HED's preferred field names. The list is in the same order as above, but the final output should list the fields alphabetically so that fields in the same group appear together.
invalid_date_lastupdt ok
invalid_date_filed ok
invalid_date_statusd ok
invalid_date_statusp ok
invalid_date_statusr ok
invalid_date_statusx ok
bistest -> z_bistest
null_units_initial -> units_init_null
null_units_proposed -> units_prop_null
large_alt_reduction -> b_large_alt_reduction
large_nb -> outlier_nb_500plus
large_demo -> outlier_demo_20plus
greatest_alt_net_increase -> outlier_top_alt_increase
greatest_alt_net_decrease -> outlier_top_alt_decrease
dup_equal_units ok
dup_diff_units ok
nonres_with_units -> b_nonres_with_units
res_accessory -> units_res_accessory
likely_class_b -> b_likely_occ_desc
co_units_prop_mismatch -> units_co_prop_mismatch
incomplete_tract -> z_incomp_tract_home
inactive_with_update -> z_inactive_with_update
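For illustration, the renames above and the alphabetical export could be sketched roughly like this in Python. The function name `final_field_order` is hypothetical and not part of the repo; fields marked "ok" simply keep their current name.

```python
# Old field name -> HED's preferred name, copied from the list above.
# Fields marked "ok" are absent from the mapping and keep their name.
RENAMES = {
    "bistest": "z_bistest",
    "null_units_initial": "units_init_null",
    "null_units_proposed": "units_prop_null",
    "large_alt_reduction": "b_large_alt_reduction",
    "large_nb": "outlier_nb_500plus",
    "large_demo": "outlier_demo_20plus",
    "greatest_alt_net_increase": "outlier_top_alt_increase",
    "greatest_alt_net_decrease": "outlier_top_alt_decrease",
    "nonres_with_units": "b_nonres_with_units",
    "res_accessory": "units_res_accessory",
    "likely_class_b": "b_likely_occ_desc",
    "co_units_prop_mismatch": "units_co_prop_mismatch",
    "incomplete_tract": "z_incomp_tract_home",
    "inactive_with_update": "z_inactive_with_update",
}


def final_field_order(old_names):
    """Apply the renames, then sort so fields sharing a prefix sit together."""
    return sorted(RENAMES.get(name, name) for name in old_names)


old = ["invalid_date_lastupdt", "bistest", "large_nb", "large_demo", "dup_equal_units"]
print(final_field_order(old))
```

Because the new names carry their group as a prefix (`outlier_`, `units_`, `b_`, `z_`), a plain alphabetical sort is enough to keep each group together.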
In the QAQC document HED had a question about the function of the "co_latest_units is negative" test that appeared in the old qc_outlier code. Can DE explain? Is this checking to see if there are negative values listed on a CO?
I think I am personally fine with EDM removing the BISTEST records automatically. We put it in because it was in the QAQC checklist that we had been working off of. If this is already taken care of, however, I don't feel much need to look into it further.
@levysamu We remove them and record them in the research table. This time around we can record them in the QAQC table instead.
@SPTKL @levysamu I think that we should continue to record the records in the QAQC table and remove them from the final output
@AmandaDoyle @mgraber @SPTKL Small revision on the logic for b_likely_occ_desc:
add this -> occ_init or occ_prop contains "hotel", "assisted", "incapacitated", "restrained", "dormitories"
remove this -> job_type = Alteration and occ_initial contains “residential” and occ_proposed contains “hotel” OR
remove this -> job_type = Alteration and occ_initial contains “hotel” and occ_proposed contains “residential”
keep -> job_desc contains any of the following words ~* 'Hotel|Motel|Boarding|Hoste|Lodge|UG 5|Group 5|Grp 5|Class B|SRO|Single room|Furnished|Rooming unit|Dorm|Transient|Homeless|Shelter|Group quarter|Beds|Convent|Monastery|Accommodation|Harassment|CNH|Settlement|Halfway|Nursing home|Assisted'
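A rough Python sketch of the revised rule, for discussion only. It assumes the occupancy check and the job_desc check are combined with OR, which the comment above does not state explicitly, and the function signature is hypothetical.

```python
import re

# Keywords to look for in either occupancy field (the "add this" rule).
OCC_KEYWORDS = ("hotel", "assisted", "incapacitated", "restrained", "dormitories")

# Case-insensitive substring pattern for job_desc (the "keep" rule),
# mirroring the ~* match in the comment.
JOB_DESC_PATTERN = re.compile(
    "Hotel|Motel|Boarding|Hoste|Lodge|UG 5|Group 5|Grp 5|Class B|SRO|Single room|"
    "Furnished|Rooming unit|Dorm|Transient|Homeless|Shelter|Group quarter|Beds|"
    "Convent|Monastery|Accommodation|Harassment|CNH|Settlement|Halfway|"
    "Nursing home|Assisted",
    re.IGNORECASE,
)


def b_likely_occ_desc(occ_initial, occ_proposed, job_desc):
    """Return True when a record looks like Class B under the revised logic."""
    # Treat NULL occupancy values as empty strings.
    occ_text = f"{occ_initial or ''} {occ_proposed or ''}".lower()
    occ_hit = any(word in occ_text for word in OCC_KEYWORDS)
    desc_hit = bool(job_desc) and JOB_DESC_PATTERN.search(job_desc) is not None
    # Assumption: either signal alone is enough to raise the flag.
    return occ_hit or desc_hit


print(b_likely_occ_desc("Residential", "Hotel", None))
print(b_likely_occ_desc("Residential", "Residential", "Renovate kitchen"))
```

Note that the pattern matches substrings, as `~*` does, so short tokens such as `SRO` or `Beds` can also match inside longer words.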
@AmandaDoyle @mgraber @SPTKL Here are the additional spatial checks we would like to perform:
geo_water: Same as old code for dev_qc_water, identifying points that fall in water
geo_taxlot: Same as old code for dev_qc_taxlot, identifying points not located in a tax lot
geo_null_latlong: Output TRUE if the job is missing lat long
geo_null_boundary: Output TRUE if any of the geographic boundary fields are null.
We have a new idea for how to flag duplicates that we'd like to propose, given the request that we present potential duplicates as groups/clusters rather than pairwise comparisons. Rather than having one column for equal unit matches and another for different unit matches, we would like to propose:
duplicates with equal units and duplicates regardless of units
@AmandaDoyle @mgraber @SPTKL This is a great idea. Please do this!
Can you sign off on the New building and demolition overlap QAQC table schema under "Supplemental QAQC tables" in #106?
@AmandaDoyle @mgraber @SPTKL Signed off!
When finding potential duplicates, do we treat two records that both have NULL units as an "equal units" match? What about two records that share an address but both have NULL geo_bbl? (#106)
If the units are null, do not mark them as having equal units. If the addresses are the same, but the BBL is NULL, mark them as the same address.
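To make the two NULL rules concrete, here is a minimal Python sketch of cluster-style duplicate flags. It assumes records are matched on job_type, geo_bbl and address (as in the older dup_equal_units definition) and uses a hypothetical `job_number` key; the "both jobs are not inactive" filter is left out.

```python
from collections import defaultdict


def flag_duplicates(records):
    """Assign a cluster id per duplicate group; None means no duplicate found."""
    by_address = defaultdict(list)
    by_address_units = defaultdict(list)
    for rec in records:
        # None is an ordinary key value, so two records with a NULL BBL
        # and the same address still land in the same address group.
        key = (rec["job_type"], rec["geo_bbl"], rec["address"])
        by_address[key].append(rec["job_number"])
        # NULL units never count as "equal units", so skip them here.
        if rec["classa_net"] is not None:
            by_address_units[key + (rec["classa_net"],)].append(rec["job_number"])

    flags = {
        rec["job_number"]: {"dup_bbl_address": None, "dup_bbl_address_units": None}
        for rec in records
    }
    for column, groups in (
        ("dup_bbl_address", by_address),
        ("dup_bbl_address_units", by_address_units),
    ):
        clusters = [jobs for jobs in groups.values() if len(jobs) > 1]
        for cluster_id, jobs in enumerate(clusters, start=1):
            for job in jobs:
                flags[job][column] = cluster_id
    return flags


records = [
    {"job_number": "A", "job_type": "New Building", "geo_bbl": None, "address": "1 Main St", "classa_net": None},
    {"job_number": "B", "job_type": "New Building", "geo_bbl": None, "address": "1 Main St", "classa_net": None},
    {"job_number": "C", "job_type": "Alteration", "geo_bbl": "1000010001", "address": "2 Oak Ave", "classa_net": 4},
    {"job_number": "D", "job_type": "Alteration", "geo_bbl": "1000010001", "address": "2 Oak Ave", "classa_net": 4},
    {"job_number": "E", "job_type": "Alteration", "geo_bbl": "1000010001", "address": "2 Oak Ave", "classa_net": 6},
]
for job, flag in flag_duplicates(records).items():
    print(job, flag)
```

In this example A and B share an address cluster despite the NULL BBL but get no units cluster, while C, D and E share an address cluster and only C and D share a units cluster.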
QAQC table
Export alphabetically
qaqc_init.sql (issue #104, PR #114 closed)
Records where dates are invalid
invalid_date_lastupdt: The value in date_lastupdt cannot be typecast as a date
invalid_date_filed: The value in date_filed cannot be typecast as a date
invalid_date_statusd: The value in date_statusd cannot be typecast as a date
invalid_date_statusp: The value in date_statusp cannot be typecast as a date
invalid_date_statusr: The value in date_statusr cannot be typecast as a date
invalid_date_statusx: The value in date_statusx cannot be typecast as a date
z_bistest: 'BISTEST' is in job description or address
qaqc_units.sql (PR #117 #118 closed)
b_large_alt_reduction: Alterations with a reduction of more than five units
outlier_nb_500plus: New buildings with classa_prop > 499
outlier_demo_20plus: Demolitions with classa_init > 19
outlier_top_alt_increase: The 20 largest alterations by classa_net
outlier_top_alt_decrease: The 20 smallest alterations by classa_net
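The unit checks above could be sketched in Python as follows. This is only an illustration: it assumes the alteration reduction is measured with classa_net (so "more than five" means classa_net < -5), uses a hypothetical `job_number` key, and breaks ties in the top-20 lists arbitrarily.

```python
def flag_unit_outliers(records, top_n=20):
    """Return the five unit-related flags for each record, keyed by job_number."""
    alterations = [
        r for r in records
        if r["job_type"] == "Alteration" and r["classa_net"] is not None
    ]
    # Ascending by net change: the front holds the biggest decreases,
    # the back holds the biggest increases.
    by_net = sorted(alterations, key=lambda r: r["classa_net"])
    top_decrease = {r["job_number"] for r in by_net[:top_n]}
    top_increase = {r["job_number"] for r in by_net[-top_n:]}

    flags = {}
    for r in records:
        job_type = r["job_type"]
        net = r["classa_net"]
        flags[r["job_number"]] = {
            # Assumption: the reduction is read off classa_net.
            "b_large_alt_reduction": job_type == "Alteration" and net is not None and net < -5,
            "outlier_nb_500plus": job_type == "New Building" and (r["classa_prop"] or 0) > 499,
            "outlier_demo_20plus": job_type == "Demolition" and (r["classa_init"] or 0) > 19,
            "outlier_top_alt_increase": r["job_number"] in top_increase,
            "outlier_top_alt_decrease": r["job_number"] in top_decrease,
        }
    return flags


sample = [
    {"job_number": 1, "job_type": "New Building", "classa_init": 0, "classa_prop": 500, "classa_net": 500},
    {"job_number": 2, "job_type": "Demolition", "classa_init": 20, "classa_prop": 0, "classa_net": -20},
    {"job_number": 3, "job_type": "Alteration", "classa_init": 10, "classa_prop": 4, "classa_net": -6},
]
for job, flag in flag_unit_outliers(sample).items():
    print(job, flag)
```

When there are fewer than 2 × `top_n` alterations, the same job can appear in both top lists, which is expected for small test extracts.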
qaqc_status.sql (PR #122 closed)
z_inactive_with_update: job is being changed to “Inactive” using corrections AND the date_lastupdt is after the last vintage date (#35)
qaqc_mid.sql (PR #118 closed and PR #124 open)
[x] dem_nb_overlap: #57 Overlap between new buildings and demolitions using match on geo_bbl (PR #124)
[x] #151 For dup_bbl_address & dup_bbl_address_units do not filter to non-NULL BBL
[x] ~dup_equal_units: job_type, BBL, address, and classa_net are all identical to another record, AND both jobs are not inactive (#31, PR #124)~ Replaced with dup_bbl_address & dup_bbl_address_units
[x] ~dup_diff_units: job_type, BBL, address are identical to another record but classa_net are different AND both jobs are not inactive (#31, PR #124)~ Replaced with dup_bbl_address & dup_bbl_address_units
Residential records have NULL unit fields
units_init_null: job_type IN ('Demolition', 'Alteration') AND resid_flag = 'Residential' AND classa_init IS NULL
units_prop_null: job_type IN ('New Building', 'Alteration') AND resid_flag = 'Residential' AND classa_prop IS NULL
[X] b_nonres_with_units: Records where occ_proposed and occ_initial suggest a non-residential record, but residential units are not zero (#26)
Identify residential accessory structures
units_res_accessory:
Likely class B
b_likely_occ_desc:
Revision:
Units prop and CO mismatch
units_co_prop_mismatch: job_type = 'New Building' AND classa_net_complt - classa_prop > 50 #224
[X] z_incomp_tract_home: tracthome = Yes and NOT job_status LIKE 'Complete'
Supplemental QAQC tables
~Potential duplicates~ REMOVE
New building and demolition overlap
Spatial QAQC #157
QAQC visualization
See issue #33