nychealth / coronavirus-data

This repository contains data on Coronavirus Disease 2019 (COVID-19) in New York City (NYC), from the NYC Department of Health and Mental Hygiene.
https://www1.nyc.gov/site/doh/covid/covid-19-data.page
Questions from an undergrad working on a senior thesis!? #176

Closed: critical-nomad closed this issue 3 years ago

critical-nomad commented 3 years ago

Hi folks!!

I am a senior undergrad @ Columbia working on my thesis, and I was wondering if it would be at all possible to get a few things: hospitalizations by modzcta and deaths by modzcta over time. My thesis focuses on mortality and the effects of income and race on it, specifically in the city, and this data would be super helpful if it exists. I'm trying to scrounge it up, and it's just not working. My method is a time series analysis of this and other metrics, and the crux of it lies with historical data. Thank you SO much!!

insidenothing commented 3 years ago

Have you looked through the commit history for older data?

critical-nomad commented 3 years ago

Okay I figured out what you mean and I have not, I am doing it now. Was there a particular update or date range that might have what I am looking for?

EDIT: OKAY! I found where you might be talking about, a commit on the 20th of December. Would it be possible to obtain this data through the present day, in a format that changes day-by-day?

EDIT 2: I found the file data-by-modzcta.csv in a 12/21 commit (it also exists in the present lol), and it has almost what I am looking for. It does not have death counts over time (like a weekly thing). Would I need to keep comparing the updates y'all did and subtracting the numbers from prior weeks, or is there a better way to do this? As for hospitalizations though, all is quiet and I cannot seem to find data that is that specific to area.
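
A minimal sketch of pulling every historical version of one file out of the commit history, driven from Python. It assumes a local clone of this repository and `git` on the PATH. The helper names (`parse_log`, `list_versions`, `read_version`) are hypothetical, and the file path is passed in rather than assumed, since it has to match where the file lives at each commit.

```python
import subprocess


def parse_log(output):
    """Parse lines of '<sha> <iso-date>' into (sha, date) tuples."""
    commits = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        sha, date = line.split(" ", 1)
        commits.append((sha, date))
    return commits


def list_versions(repo_dir, path):
    """List the commits that touched `path`, newest first."""
    out = subprocess.run(
        ["git", "log", "--format=%H %cI", "--", path],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return parse_log(out)


def read_version(repo_dir, sha, path):
    """Return the contents of `path` as of commit `sha`.

    `path` is relative to the repository root, and `repo_dir` is
    assumed to be that root.
    """
    return subprocess.run(
        ["git", "show", f"{sha}:{path}"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
```

Calling `list_versions("coronavirus-data", "data-by-modzcta.csv")` and then `read_version` on each result would give one dated snapshot per commit; both arguments here are assumptions about the local clone and should be checked against it.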

insidenothing commented 3 years ago

By hand is the best way. The data format changed a little over time, enough to mess with automation.
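
For the subtraction itself, here is a rough sketch of turning cumulative snapshots into per-interval counts. It assumes each snapshot is CSV text with `MODIFIED_ZCTA` and `COVID_DEATH_COUNT` columns; those column names are an assumption and may not hold for every commit, given the format changes mentioned above. The function names are hypothetical.

```python
import csv
import io


def read_cumulative(csv_text, key_col="MODIFIED_ZCTA", value_col="COVID_DEATH_COUNT"):
    """Parse one snapshot into {modzcta: cumulative count}.

    The default column names are assumptions; pass others if the
    snapshot uses a different header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col]: int(float(row[value_col])) for row in reader}


def diff_snapshots(snapshots):
    """Turn cumulative snapshots into increments between snapshots.

    snapshots: list of (label, {modzcta: cumulative_count}), oldest first.
    Returns a list of (label, {modzcta: change_since_previous_snapshot}).
    """
    increments = []
    for (_, prev), (label, curr) in zip(snapshots, snapshots[1:]):
        # Only difference areas present in both snapshots. A negative
        # value means the cumulative count was revised downward.
        delta = {z: curr[z] - prev[z] for z in curr if z in prev}
        increments.append((label, delta))
    return increments
```

Negative increments are left in rather than clipped to zero, so that any downward revisions in the source data stay visible and can be handled explicitly.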

critical-nomad commented 3 years ago

Word, sounds good!! Is there hospitalization data for modzctas?? Or is it too specific / laggy to be recorded in more depth than the borough level?

insidenothing commented 3 years ago

That I don't know, maybe someone else does.

nygeog commented 3 years ago

@critical-nomad

I have the daily diffs, see here: https://nyc-covid-data-u46xbtaf6a-uk.a.run.app/nyc_covid but search the Issues in this repo for DOH's explanations of the problems with that methodology. They updated and corrected data in the summer as data was reconciled.

I have a sqlite database here: https://nyc-covid-data-u46xbtaf6a-uk.a.run.app/nyc_covid.db

^ These URLs will change, so grab this soon or ask me later for the most recent URL of the Datasette app I'm hosting.
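
Once the .db file is downloaded, a quick way to see what it contains is to list its tables and columns. This is a minimal sketch using Python's standard `sqlite3` module; it makes no assumption about table names, and `describe_tables` is a hypothetical helper name.

```python
import sqlite3


def describe_tables(conn):
    """Return {table_name: [column names]} for every table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [col[1] for col in cols]
    return schema
```

For example, `describe_tables(sqlite3.connect("nyc_covid.db"))` would return the schema of a local copy saved under that filename, which can then be checked against the data dictionary.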

Here's the Data Dictionary: https://www.dropbox.com/s/lhiq8opfbsa0eh0/nyc_covid_data_dictionary.pdf

Just make sure you cite our work here at Mount Sinai compiling this data. Best of luck.

Also, as @insidenothing said, I have not seen hospitalization data at the MODZCTA level.

mmontesanonyc commented 3 years ago

Thanks for the inquiry.

Per this repository’s readme, we cannot fulfill custom data requests. However, check back in this repository regularly, as we may continue to enhance our offerings and release new data files. And, issues like this help us understand what people are interested in tracking.

critical-nomad commented 3 years ago

Thank you so much guys for the help!! I got the nyc_covid.db file, and it's AMAZING! However, it does not start tracking deaths or cases until like 07/12/2020. Is there any way I can find data stretching back further into 2020?? Like around February maybe? I will 10000000% be citing you guys, this would not be possible without you!!

mmontesanonyc commented 3 years ago

Hi @critical-nomad, we have added hospitalization rate and death rate by modzcta, by month, since March 2020. Note that data are suppressed in instances of small counts; more notes are available in Trends/Readme.md.