Closed damonmcc closed 4 months ago
Update steps (old issue template)
Like most of our data products, source data must be updated in data library before FacDB is run. As there are many source datasets with varied update processes, this issue template should be opened to track progress towards updating all source data.
All source data listed is to be uploaded as .csv files
version.env file
[x] bpl_libraries Source: Scraped from BPL website Source url: https://www.bklynlibrary.org/locations/json
[x] nypl_libraries Source: Scraped from NYPL website Source url: https://www.nypl.org/locations/list
[x] uscourts_courts Source: Court locator for NY state Source url: http://www.uscourts.gov/court-locator/city/New%20York/state/NY
To see if a dataset needs to be uploaded, check date last updated in open data against version in data library
[x] dca_operatingbusinesses https://data.cityofnewyork.us/Business/Legally-Operating-Businesses/w7w3-xahh
[x] dcp_colp https://www1.nyc.gov/site/planning/data-maps/open-data.page#city_facilities
[x] dcp_facilities_with_unmapped https://www1.nyc.gov/site/planning/data-maps/open-data/dwn-selfac.page
[x] dcla_culturalinstitutions https://data.cityofnewyork.us/Recreation/DCLA-Cultural-Organizations/u35m-9t32
[x] dfta_contracts https://data.cityofnewyork.us/Social-Services/DFTA-Contracts/6j6t-3ixh
[x] doe_busroutesgarages https://data.cityofnewyork.us/Transportation/Routes/8yac-vygm
[x] sca_enrollment_capacity https://data.cityofnewyork.us/Education/Enrollment-Capacity-And-Utilization-Reports-Target/8b9a-pywy
[x] dohmh_daycare https://data.cityofnewyork.us/Health/DOHMH-Childcare-Center-Inspections/dsg6-ifza
[x] dpr_parksproperties https://nycopendata.socrata.com/Recreation/Parks-Properties/enfh-gkve NOTE: DPR open data table URLs are not consistent. Be sure to double-check before running from the recipes app.
[x] dsny_garages https://data.cityofnewyork.us/Environment/DSNY-Garages/xw3j-2yxf
[x] dsny_specialwastedrop https://data.cityofnewyork.us/Environment/DSNY-Special-Waste-Drop-off-Sites/242c-ru4i
[x] dsny_donatenycdirectory https://data.cityofnewyork.us/Environment/DSNY-DonateNYC-Directory/gkgs-za6m
[x] dsny_leafdrop https://data.cityofnewyork.us/Environment/Leaf-Drop-Off-Locations-in-NYC/8i9k-4gi5
[x] dsny_fooddrop https://data.cityofnewyork.us/Environment/Food-Scrap-Drop-Off-Locations-in-NYC/if26-z6xq
[x] dsny_electronicsdrop https://data.cityofnewyork.us/Environment/Electronics-Drop-Off-Locations-in-NYC/wshr-5vic
[x] dycd_afterschoolprograms https://data.cityofnewyork.us/Education/DYCD-after-school-programs/mbd7-jfnc
[x] fdny_firehouses https://data.cityofnewyork.us/Public-Safety/FDNY-Firehouse-Listing/hc8x-tcnd
[x] hhc_hospitals https://data.cityofnewyork.us/Health/Health-and-Hospitals-Corporation-HHC-Facilities/f7b6-v6v3
[x] hra_jobcenters https://data.cityofnewyork.us/Business/Directory-Of-Job-Centers/9d9t-bmk7
[x] hra_medicaid https://data.cityofnewyork.us/City-Government/Medicaid-Offices/ibs4-k445
[x] hra_snapcenters https://data.cityofnewyork.us/Social-Services/Directory-of-SNAP-Centers/tc6u-8rnp
[x] moeo_socialservicesitelocations https://data.cityofnewyork.us/City-Government/Verified-Locations-for-NYC-City-Funded-Social-Serv/2bvn-ky2h
[x] nycha_communitycenters https://data.cityofnewyork.us/Social-Services/Directory-of-NYCHA-Community-Facilities/crns-fw6u
[x] nycha_policeservice https://data.cityofnewyork.us/Housing-Development/NYCHA-PSA-Police-Service-Areas-/72wx-vdjr
[x] nysdec_solidwaste https://data.ny.gov/Energy-Environment/Solid-Waste-Management-Facilities/2fni-raj8
[x] nysdoh_healthfacilities https://health.data.ny.gov/Health/Health-Facility-General-Information/vn5v-hh5r
[x] nysdoh_nursinghomes https://health.data.ny.gov/Health/Nursing-Home-Weekly-Bed-Census-Last-Submission/izta-vnpq
[x] nysomh_mentalhealth https://data.ny.gov/Human-Services/Local-Mental-Health-Programs/6nvr-tbv8
[x] nysopwdd_providers https://data.ny.gov/Human-Services/Directory-of-Developmental-Disabilities-Service-Pr/ieqx-cqyk
[x] nysparks_historicplaces https://data.ny.gov/Recreation/National-Register-of-Historic-Places/iisn-hnyv
[x] nysparks_parks https://data.ny.gov/Recreation/State-Park-Facility-Points/9uuk-x7vh
[x] qpl_libraries https://data.cityofnewyork.us/Education/Queens-Library-Branches/kh3d-xhq7
[x] sbs_workforce1 https://data.cityofnewyork.us/dataset/Center-Service-Locations/6smc-7mk6
[x] usdot_airports https://hub.arcgis.com/datasets/usdot::airports Head to url >> api >> copy url from geojson
[x] usdot_ports https://hub.arcgis.com/datasets/usdot::ports Head to url >> api >> copy url from geojson
[x] nysdec_lands http://gis.ny.gov/gisdata/inventories/details.cfm?DSID=1114
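The date check described above the list could be sketched roughly as follows. This assumes the dataset metadata is a Socrata view record (for example, fetched from a URL like https://data.cityofnewyork.us/api/views/w7w3-xahh.json) with a `rowsUpdatedAt` field in epoch seconds, and that data library versions are YYYYMMDD strings. The function names are made up for illustration and are not part of data library.

```python
from datetime import date, datetime, timezone


def source_last_updated(view_metadata: dict) -> date:
    """Return the date the rows were last updated.

    Assumes Socrata-style metadata with `rowsUpdatedAt` in epoch seconds.
    """
    updated = datetime.fromtimestamp(view_metadata["rowsUpdatedAt"], tz=timezone.utc)
    return updated.date()


def needs_upload(view_metadata: dict, library_version: str) -> bool:
    """True when open data is newer than the archived version.

    Assumes `library_version` is a YYYYMMDD string.
    """
    archived = datetime.strptime(library_version, "%Y%m%d").date()
    return source_last_updated(view_metadata) > archived


if __name__ == "__main__":
    # Metadata would normally come from the open data portal; hardcoded here.
    metadata = {"rowsUpdatedAt": 1704067200}  # 2024-01-01 00:00:00 UTC
    print(needs_upload(metadata, "20231201"))
```

The sources in the next list do not expose a date like this, so they still need a manual look.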
These don't report the date updated as neatly as the open datasets, so you have to look at the data itself
[x] fbop_corrections https://www.bop.gov/locations/list.jsp When searching by state, there should be 5 NY prisons, 3 of which are in NYC (Brooklyn/New York)
[x] nycdoc_corrections https://www1.nyc.gov/site/doc/about/facilities-locations.page Source: NYCDOC locations directory
[x] nycourts_courts http://www.nycourts.gov/courts/nyc/criminal/generalinfo.shtml#BRONX_COUNTY
[x] nysdoccs_corrections https://doccs.ny.gov/find-facility Hand check for 1 facility in queens, 1 facility in Manhattan, 0 in the other 3 boros. Only look at the correctional facility locations, not the offices.
[x] doe_lcgms https://data.cityofnewyork.us/Education/LCGMS-DOE-School-Information-Report/3bkj-34v2 This dataset is updated for CEQR
[x] foodbankny_foodbanks http://www.foodbanknyc.org/get-help/
Navigate to the map and make a copy of the map.
After making a copy, click on the three dots next to the target layer, click "Export Data", and export as a csv.
Rename the file (still as a csv) to match Food_Bank_For_NYC_Open_Members_as_of_DATE(YYYYMMDD). You will need to convert the existing date format MMDDYY to YYYYMMDD so that the version matches the existing date format standard in data library.
Place it in the library/tmp folder.
Then run library archive --name foodbankny_foodbanks with the -version flag set to the DATE in the file path.
url: "http://www.foodbanknyc.org/get-help/" dependents: []
[x] nysed_activeinstitutions https://eservices.nysed.gov/sedreports/list?id=1 Active Institutions with GIS coordinates and OITS Accuracy Code - Select by County__ CSV. Note that the .csv data is automatically downloaded without a comma delimiter. Exporting to csv from Numbers is one way to get around this issue. (Exporting as an xls and converting to a csv is also an option.)
[x] nysoasas_programs https://webapps.oasas.ny.gov/providerDirectory/index.cfm?search_type=2 Download all treatment providers Modify download URL to contain today’s date: https://webapps.oasas.ny.gov/providerDirectory/download/Treatment_Providers_OASAS_Directory_Search_13-Nov-20.csv
[x] usnps_parks https://irma.nps.gov/DataStore/Reference/Profile/2225713 NOTE: the final number in the URL (2225713) is not always stable. If the data is missing, search through the home.
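Two items in the list above involve date reformatting: the foodbankny_foodbanks file rename (MMDDYY to YYYYMMDD) and the nysoasas_programs download URL (dates like 13-Nov-20). A minimal sketch of both, assuming the renamed file keeps a `.csv` extension and that the OASAS URL zero-pads the day. Both helper names are hypothetical.

```python
from datetime import date, datetime

# English month abbreviations, spelled out to avoid locale-dependent output.
MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

OASAS_URL_PREFIX = (
    "https://webapps.oasas.ny.gov/providerDirectory/download/"
    "Treatment_Providers_OASAS_Directory_Search_"
)


def foodbank_filename(mmddyy: str) -> str:
    """Build the renamed export filename from an MMDDYY date string."""
    version = datetime.strptime(mmddyy, "%m%d%y").strftime("%Y%m%d")
    return f"Food_Bank_For_NYC_Open_Members_as_of_{version}.csv"


def oasas_download_url(day: date) -> str:
    """Build the OASAS download URL for a given date (DD-Mon-YY, day zero-padded by assumption)."""
    stamp = f"{day.day:02d}-{MONTH_ABBR[day.month - 1]}-{day.year % 100:02d}"
    return f"{OASAS_URL_PREFIX}{stamp}.csv"


if __name__ == "__main__":
    print(foodbank_filename("111320"))
    print(oasas_download_url(date.today()))
```

The YYYYMMDD part of the filename is the value that would be passed as the version when archiving.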
* dfta_contracts: id has changed.
* usnps_parks: usnps_parks template in data library: EPSG:4269
doe_universalprek: the Type column changed, but there appears to be a 1-1 mapping between old and new values. Proposed mapping (old -> new):
* Charter -> Charter (no change)
* PKS -> PKS (no change)
* DOE -> Public School
* NYCEEC -> CBO
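For reference, the proposed doe_universalprek mapping can be written as a lookup with a quick check that it really is 1-1. This is only a sketch of the mapping as stated above; the names `TYPE_MAPPING`, `map_type`, and `is_one_to_one` are illustrative and not taken from the FacDB build.

```python
# Proposed mapping of doe_universalprek Type values (old -> new), as listed above.
TYPE_MAPPING = {
    "Charter": "Charter",
    "PKS": "PKS",
    "DOE": "Public School",
    "NYCEEC": "CBO",
}


def map_type(old_value: str) -> str:
    """Translate an old Type value; raises KeyError for values not in the mapping."""
    return TYPE_MAPPING[old_value]


def is_one_to_one(mapping: dict) -> bool:
    """True when no two old values map to the same new value."""
    return len(set(mapping.values())) == len(mapping)


if __name__ == "__main__":
    print(map_type("DOE"))
    print(is_one_to_one(TYPE_MAPPING))
```

Raising on an unmapped value is a deliberate choice here so that any new Type value in a future version is noticed rather than silently passed through.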
@AmandaDoyle - dycd_afterschoolprograms seems to not exist in socrata, and we haven't updated it since 2017.
This seems like a potential replacement, but looks potentially similar to dfta_contracts in that the dataset has changed dramatically
@alexrichey what did you do for usdot_airports, usdot_ports, and nysdec_lands in #51 ?
Following up on dfta_contracts source data.
As mentioned previously, there have been changes in new dfta_contracts version.
1) Column names have changed and there are more columns in the new version overall. We use only three columns in FacDB builds, all of which are present in the new version. The column mapping (old --> new) is the following:
* `contract_type` --> `providertype`
* `provider_id` --> `dfta_id`
* `program_address` --> `programaddress`
2) contract_type column values have changed. This column is used in our build to classify dfta_contracts records with our own categories for the factype column as Senior Center, Senior Services, and Home Delivered Meals. Note, the Home Delivered Meals category used to come from the contract_type column (i.e. we just kept the value from the table instead of creating our own).
There are 15 categories in the old version and 12 in the new version. Below are the categories with their corresponding total record counts:
* Home Delivered Meals: "HOME DELIVERED MEAL SERVICE CONTRACTS", "CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS"
* Senior Center: "OLDER ADULT CENTER CONTRACTS ", "NATURALLY OCCURING RETIREMENT COMMUNITY CONTRACTS", ""
* Senior Services: all other categories.

In the new version, there are 2 meal categories and both of them are named differently when compared to the old one. Do we want to assign them to the old value Home Delivered Meals or do we want to name it something different?
[...] Senior Services. In the proposed mapping, it's Senior Center. Does it sound okay?
[...] nycha_communitycenters so norc maybe should go to that rather than "Senior Centers". cc: @fvankrieken
I'm investigating how we'll update our use of dycd_afterschoolprograms.
The original socrata dataset is gone. In the likely replacement dataset, named DYCD Program Sites, afterschool programs are a subset of the records. I'm likely to suggest we consider this a new dataset to start archiving and using.
@fvankrieken
yup! @sf-dcp figured it out locally so then I made changes in #546 and ran in CI
edit: to clarify, the gdb was ingested by data-library and converted to our 3 default formats
noting though that a *.gdb file is pretty much a folder with lots of files in it. I'm not sure if it works with a zipped version of that folder, but that'd be nice
@sf-dcp
> In the new version, there are 2 meal categories and both of them are named differently when compared to the old one. Do we want to assign them to the old value Home Delivered Meals or do we want to name it something different?

The current logic is `WHEN contract_type LIKE '%MEALS%' THEN initcap(contract_type)`
If the record with provider type that is now CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS was HOME DELIVERED MEALS in the previous version of the data then yes, assign both "HOME DELIVERED MEAL SERVICE CONTRACTS" and "CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS" to Home Delivered Meals.
"OLDER ADULT CENTER CONTRACTS " should be categorized as Senior Center. Tell me more about the records that are "NATURALLY OCCURING RETIREMENT COMMUNITY CONTRACTS." (We can talk about this when we meet). What function do they provide, are they a location where people go?
Per further discussion with Amanda, there is only 1 CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS record, which appears to be a home delivery service. We will hardcode this contract to be classified as Home Delivered Meals and add a note to the FacDB build template to validate additional CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS records in the future. We don't want to classify the whole category as Home Delivered Meals, to avoid the future risk of misclassifying these records, as the category is not 100% clear.
Regarding NORC, they seem to be of a service rather than a center type. See description of NORC programs here. Will classify NORC programs accordingly.
cc: @fvankrieken
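To summarize where the dfta_contracts discussion landed, here is a rough sketch of the column renames and the factype classification. It is not the actual build SQL. The ID in `HARDCODED_HOME_DELIVERED_IDS` is a hypothetical placeholder (the real record ID is not given in this thread), and sending non-hardcoded CITY MEALS records to Senior Services is an assumption based on the "all other categories" rule.

```python
# Column renames listed in this thread (old -> new).
COLUMN_RENAMES = {
    "contract_type": "providertype",
    "provider_id": "dfta_id",
    "program_address": "programaddress",
}

HOME_DELIVERED = "HOME DELIVERED MEAL SERVICE CONTRACTS"
CITY_MEALS = "CITY MEALS ADMINISTRATIVE SERVICES CONTRACTS"
OLDER_ADULT_CENTER = "OLDER ADULT CENTER CONTRACTS"

# Hypothetical placeholder for the single hand-reviewed CITY MEALS record.
HARDCODED_HOME_DELIVERED_IDS = {"PLACEHOLDER_ID"}


def rename_columns(record: dict) -> dict:
    """Apply the old -> new column names; other keys pass through unchanged."""
    return {COLUMN_RENAMES.get(key, key): value for key, value in record.items()}


def classify_factype(providertype: str, dfta_id: str) -> str:
    """Assign a factype from providertype (source values may carry trailing spaces)."""
    category = providertype.strip().upper()
    if category == HOME_DELIVERED:
        return "Home Delivered Meals"
    if category == CITY_MEALS and dfta_id in HARDCODED_HOME_DELIVERED_IDS:
        return "Home Delivered Meals"
    if category == OLDER_ADULT_CENTER:
        return "Senior Center"
    # NORC and all remaining categories are treated as services.
    return "Senior Services"


def needs_manual_validation(providertype: str, dfta_id: str) -> bool:
    """Flag CITY MEALS records that have not been reviewed by hand."""
    is_city_meals = providertype.strip().upper() == CITY_MEALS
    return is_city_meals and dfta_id not in HARDCODED_HOME_DELIVERED_IDS


if __name__ == "__main__":
    print(classify_factype("OLDER ADULT CENTER CONTRACTS ", "any"))
    print(needs_manual_validation(CITY_MEALS, "some_new_id"))
```

The `needs_manual_validation` helper stands in for the note in the build template: any new CITY MEALS record should be looked at before it is given a category.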
Shared with GIS team for QA.
@croswell81 & @jackrosacker, just checking in to see if there’s any update on the review?
@sf-dcp No, but we plan to do this next sprint
Update: addressed issues from the second QAQC review and shared corrected outputs with GIS. Pending their review.
it's on Bytes and I'll distribute to Open Data
distributed to Open Data page here
updated metadata in https://github.com/NYCPlanning/product-metadata/pull/6
ran it locally so no action to link to, but could run again using the github action for tracking purposes
update: used github action here to push the same data and to test the metadata PR
Product Name
facilities
Build Version
24Q1
Status of Update