[x] #428 dca_operatingbusinesses has overagency in overabbrev
[x] #425 dcp_pops is missing facname and several other key fields
[x] #424 Incorrect script for hra_jobcenters:
name: hra_jobcenters
version: 20200901
scripts:
hra_snapcenters.sql
[x] #433 uscourts_courts is missing overagency and overlevel
[x] #427 nysdoh_nursinghomes is missing factype
[x] #433 nysparks_historicplaces is missing opname, opabbrev, optype, overagency, overlevel. Can we fill anything in?
[x] #429 sca_enrollmentcapacity needs to get filtered before join with doe_lcgms as it is here
Questions when comparing published version to development version.
[x] Why are there so many more school seats and routes? Add / make sure that deduplication to doe_busroutesgarages is the same. (Seats might get resolved by #429.)
[x] #435 Capitalization issue on FacType, Subgroup, Type.
[ ] Look into the increase in records for LCGMS (#429), DayCare (because of deduplication with doe_universalprek, see #410), NYCDOH_healthfacilities (#436), nysed_activeinstitutions, and nysomh_mentalhealth
[ ] Geocoding for FDNY, DCLA, MOEO, and any source where the missing geometry is >= 15% and the source is NYC
For FDNY: do we need to add the manual geometries into the manual_research table?
For MOEO: we have entirely new source data, so a version-to-version comparison doesn't make sense. The name of the data source has also changed because of a typo in the previous version. As for the success of 1B: addresses sometimes contain lists separated by ";". Do we take the first?
For DCLA: many addresses are PO Boxes. Those records are also missing lat/lon to backfill.
For DYCD: addresses are of the form "1537 Washington Avenue10457
(40.838299532463, -73.903023331464)", so parsing tends to miss the "Avenue" part because there is no space between the street name and the zipcode
For NYSOMH: source data is missing address info for 685 of 1540 records
For HHC: several street names are missing East or West
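The DYCD and MOEO notes above describe two address-cleaning steps that could run before geocoding. Below is a minimal Python sketch, not the project's actual parser. It assumes every DYCD value follows the quoted layout (street text, a five-digit zipcode that may be fused to the street, then a parenthesized lat/lon pair). The names `parse_dycd_address` and `first_address` are hypothetical, and `first_address` only applies if the answer to the MOEO question is "yes, take the first".

```python
import re

# Assumed layout: "<street><5-digit zip>" then "(<lat>, <lon>)".
# The zipcode may be fused to the street with no space in between.
DYCD_PATTERN = re.compile(
    r"^(?P<street>.*?)\s*(?P<zipcode>\d{5})\s*"
    r"\(\s*(?P<lat>-?\d+(?:\.\d+)?)\s*,\s*(?P<lon>-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.DOTALL,
)


def parse_dycd_address(raw):
    """Split a DYCD-style value into street, zipcode, lat and lon.

    Returns None when the value does not follow the assumed layout.
    """
    match = DYCD_PATTERN.match(raw.strip())
    if match is None:
        return None
    return {
        "street": match.group("street").strip(),
        "zipcode": match.group("zipcode"),
        "lat": float(match.group("lat")),
        "lon": float(match.group("lon")),
    }


def first_address(raw):
    """Keep only the first entry of a ';'-separated address list."""
    return raw.split(";")[0].strip()
```

Anchoring the pattern on the five digits directly before the coordinates keeps a leading house number such as "1537" in the street field. Values that do not match return None and would still need manual review.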
Questions but will not address:
[ ] Criteria for picking which record to keep when deduplicating, to improve stability (e.g. dohmh_daycare)
[ ] Why is property type only filled out for COLP?
Notes and potential bugs
overagency, overabbrev, overlevel
facsubgrp, facgroup, facdomain, servarea