davidbau / covid-19-chart

Chart of current COVID-19 time series data. Enables a variety of county- state- and nation-level comparisons and data exploration.
https://covid19chart.org/

Which data file(s) are being used to generate the stats? #19

Closed PySimpleGUI closed 4 years ago

PySimpleGUI commented 4 years ago

I would like to add onto the desktop application I built for worldwide data to include data about the individual states.

The CSV file I'm using for my current code is this time-series file: https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv

The entire dataset is in the single CSV file.

For this project's data, are you aggregating the data that is contained in the individual daily reports? https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports

It seems like that's the only way to get data at a state level.

davidbau commented 4 years ago

> are you aggregating the data that is contained in the individual daily reports

Yes. See this code:

https://github.com/davidbau/covid-19-chart/blob/master/lib/load_csse.js

That is, every browser that visits the page does the aggregation itself. I hope CSSE starts publishing an aggregated time series with US breakdowns, because if this continues for 18 months, that will be a few hundred GET requests to fetch the data for each view. But that's what we do for now.

Feel free to use this for your own project, if you find it useful.
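For anyone working in Python instead of JS, here is a minimal sketch of the same idea: take one CSV per day and build a per-state time series from them. It is not a port of load_csse.js. The raw-file base URL, the MM-DD-YYYY.csv naming, and the column names (`Province_State` / `Province/State`, `Country_Region` / `Country/Region`, `Confirmed`) are assumptions about the CSSE daily reports, and the helper names are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import date, timedelta

# Assumed raw-file location of the CSSE daily reports (one CSV per day).
BASE_URL = ("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
            "csse_covid_19_data/csse_covid_19_daily_reports/")


def daily_report_urls(start, end):
    """Yield (day, url) for each day from start to end, inclusive."""
    day = start
    while day <= end:
        # Assumed file naming: MM-DD-YYYY.csv
        yield day, BASE_URL + day.strftime("%m-%d-%Y") + ".csv"
        day += timedelta(days=1)


def state_totals(csv_text, country="US"):
    """Sum confirmed cases per state within a single daily report."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: column names changed over time, so accept both spellings.
        region = row.get("Country_Region") or row.get("Country/Region")
        state = row.get("Province_State") or row.get("Province/State")
        if region != country or not state:
            continue
        totals[state] += int(float(row.get("Confirmed") or 0))
    return dict(totals)


def build_time_series(reports):
    """reports maps date -> CSV text; returns {state: {date: confirmed}}."""
    series = defaultdict(dict)
    for day in sorted(reports):
        for state, count in state_totals(reports[day]).items():
            series[state][day] = count
    return dict(series)
```

Fetching is left out on purpose: download each URL from `daily_report_urls` with whatever HTTP client you prefer, store the text in a dict keyed by date, and pass that dict to `build_time_series`.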

PySimpleGUI commented 4 years ago

Thank you very much for the info. VERY helpful.

I added a comment to an open issue where people are requesting a time series CSV.

https://github.com/CSSEGISandData/COVID-19/issues/1505

In addition to showing your graph as an example of why the data is needed, I suggested that someone automate the creation of a CSV file based on the data and then post that CSV to another GitHub repository. If I had the skillz to do it I would, but I'm afraid a web guy I am not. I do desktop pretty well, but not web. My code is in Python and I don't know JS, so I'm in a holding pattern until I can get a good data source going. Someone will do something, I'm sure.

Hopefully either Johns Hopkins will aggregate the data or someone will write something and post the results daily. 🤞

PySimpleGUI commented 4 years ago

I learned from the other issue I posted that someone is now posting the aggregated data. I'm going to be using it to create state-level time series like you're doing. I didn't have it in me to aggregate all the data from so many files. It's bad enough summing the county-level data up to the state level.
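The county-to-state summing can be sketched roughly like this. It assumes a wide CSV with one row per county, a state column, and one column per date. The column names `County` and `Province_State` and the function name are placeholders, so adjust them to whatever the aggregated file actually uses.

```python
import csv
import io
from collections import defaultdict


def roll_up_to_states(csv_text, state_col="Province_State",
                      id_cols=("County", "Province_State")):
    """Sum county rows of a wide time-series CSV into one list per state.

    Every column not named in id_cols is treated as a date column
    (an assumption about the file layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    date_cols = [c for c in reader.fieldnames if c not in id_cols]
    states = defaultdict(lambda: [0] * len(date_cols))
    for row in reader:
        sums = states[row[state_col]]
        for i, col in enumerate(date_cols):
            # Treat blank cells as zero rather than failing on them.
            sums[i] += int(float(row[col] or 0))
    return date_cols, dict(states)
```

The function returns the list of date columns plus a dict mapping each state to its per-date totals, which can be plotted directly.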

It's risky to be one step removed from the actual source data, but the Johns Hopkins data is itself one step removed from its WHO, CDC and other sources.

Here's the location of the data. I think (maybe?) it may end up being more accurate in some ways even though it's aggregated. The more eyeballs there are on the data, the better the chances that errors get detected and corrected.

https://www.soothsawyer.com/john-hopkins-time-series-data-confirmed-case-csv-after-march-22-2020/?github=pysimplegui

Your graph is aggregated, and I trust that you're looking at the underlying data and making sure it's good. Your dashboard is my new go-to page for understanding how the USA is doing in the fight. I like my worldwide country tracking, but for the internal USA data, you've got the right format.