CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

I created a CLEANED dataset that combines cases, recoveries, and deaths to 1 CSV file #1241

Open jbarton311 opened 4 years ago

jbarton311 commented 4 years ago

This is not an issue per se, but I wanted to share this info with the community.

I've created a SINGLE, cleaned dataset that combines ALL cases, recoveries, and deaths into a single CSV file. It also:

I believe it is in an easier-to-use format than the CSVs posted to this repo (especially when performing analysis with BI tools).

The dataset along with more info (and the code used to create it) can be found here
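For readers who want a feel for what "combining" involves, here is a minimal sketch. It is not the code linked above. It assumes the wide time-series layout (identifier columns `Province/State`, `Country/Region`, `Lat`, `Long`, followed by one column per date), and the function names, output column names, and sample numbers are all made up for illustration.

```python
import csv
import io

# Assumed identifier columns of the wide time-series files.
ID_COLUMNS = ["Province/State", "Country/Region", "Lat", "Long"]


def wide_to_long(csv_text, metric):
    """Turn one wide CSV (one column per date) into
    {(province, country, date): {metric: count}}."""
    records = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # every remaining column is treated as a date
            key = (row["Province/State"], row["Country/Region"], column)
            # Empty cells become None instead of raising on int("").
            records[key] = {metric: int(value) if value else None}
    return records


def combine(sources):
    """Merge several metrics into one long table, one row per place and date."""
    combined = {}
    for metric, csv_text in sources.items():
        for key, values in wide_to_long(csv_text, metric).items():
            combined.setdefault(key, {}).update(values)
    rows = []
    for (province, country, date), values in combined.items():
        row = {"province_state": province, "country_region": country, "date": date}
        for metric in sources:
            row[metric] = values.get(metric)  # None if a file lacks this key
        rows.append(row)
    return rows


# Toy data with invented numbers, only to show the shape of the input.
SAMPLE_CASES = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    ",Italy,43.0,12.0,0,2\n"
)
SAMPLE_DEATHS = (
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    ",Italy,43.0,12.0,0,1\n"
)

if __name__ == "__main__":
    for row in combine({"cases": SAMPLE_CASES, "deaths": SAMPLE_DEATHS}):
        print(row)
```

The long format (one row per place and date, one column per metric) is usually what BI tools expect, which is why the wide date columns are unpivoted before merging.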

Thank you JHU for all of the hard work here!

valeriupredoi commented 4 years ago

very cool! I am already using JHU data for linear analyses here - might switch to yours soon if that's okay with you :beer:

rufuspollock commented 4 years ago

Cool. We did this a couple of weeks ago for Open Data Day and repo is here (details in https://www.datopian.com/blog/2020/03/17/odd-covid-19/):

https://github.com/datasets/covid-19

Maybe we can join forces?

There's also json data from DataHub dataset we are publishing here https://datahub.io/core/covid-19#data

jbarton311 commented 4 years ago

I should have also mentioned - I have leveraged this dataset to build a simple dashboard using Google Data Studio.

paolinic03 commented 4 years ago

Hey @jbarton311 so when you say the dataset contains all recoveries, why does your simple dashboard show no recoveries? Not seeing the difference. Thanks.

jbarton311 commented 4 years ago

Hi @paolinic03 ,

There are currently issues with the JHU data for US recoveries. Once those are fixed, the dashboard will display proper US recoveries data. Here is a sample of what data is contained in my dataset. You'll see several recoveries columns.

paolinic03 commented 4 years ago

@jbarton311 ok makes sense. Appreciate it. Good job with your dashboard. Like the look.

jbarton311 commented 4 years ago

FYI - based on the recent announcement from JHU, I will have to modify what I've done to match their new data moving forward. Output data format will have to change slightly.

joshp112358 commented 4 years ago

I made this based on the dataset. It does some basic time series visualisations.

https://joshyp.shinyapps.io/COVID_VIZ/

verayanakieva commented 4 years ago

Hi,

Do you plan to continue updating the data (combined incl. recovered)?

https://datahub.io/core/covid-19#data

It'd be great if that's the case.

jbarton311 commented 4 years ago

@verayanakieva

I plan to continue to update the data but am waiting for JHU to come out with the US-specific datasets that they mentioned they'd be releasing. I will not be including recovered as they are no longer tracking that metric.

valeriupredoi commented 4 years ago

I think the changes have been out since yesterday's daily dataset, mate - I had to change the data module in my code, since as of yesterday there's heaps more on individual cities in the US and also a total restructuring of the order in which members are in the table :beer:
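One way to make a loader less sensitive to this kind of reshuffle is to read fields by header name rather than by position. The sketch below is only an illustration: the header variants in `ALIASES` are assumptions about the old and new daily files, the helper names are hypothetical, and the sample numbers are invented.

```python
import csv
import io

# Assumed header variants; adjust to whatever the files actually contain.
ALIASES = {
    "country": ("Country_Region", "Country/Region"),
    "confirmed": ("Confirmed",),
}


def pick(row, field):
    """Return the value of the first known header variant present in the row."""
    for name in ALIASES[field]:
        if name in row:
            return row[name]
    raise KeyError(f"no column found for {field!r}")


def confirmed_by_country(csv_text):
    """Sum confirmed counts per country, regardless of column order."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = pick(row, "country")
        count = pick(row, "confirmed")
        totals[country] = totals.get(country, 0) + int(count or 0)
    return totals


# Toy inputs with invented numbers: an older-style and a newer-style layout.
OLD_STYLE = "Province/State,Country/Region,Confirmed\nHubei,China,10\n,Italy,3\n"
NEW_STYLE = (
    "Admin2,Province_State,Country_Region,Confirmed\n"
    "King,Washington,US,5\n"
    "Cook,Illinois,US,2\n"
)

if __name__ == "__main__":
    print(confirmed_by_country(OLD_STYLE))
    print(confirmed_by_country(NEW_STYLE))
```

Because each value is looked up by name, added columns or a changed column order do not affect the result; only a renamed header needs a new entry in `ALIASES`.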

rufuspollock commented 4 years ago

> Do you plan to continue updating the data (combined incl. recovered)?
>
> https://datahub.io/core/covid-19#data

Yes, we (@datasets / @datopian) plan to keep that updated.

BTW the github source is here https://github.com/datasets/covid-19

jbarton311 commented 4 years ago

@valeriupredoi - I am planning on waiting for them to release the time series CSVs for the US, which hopefully will be very soon. From what I hear, the county data in the daily CSVs only goes back so far.

@rufuspollock - where are you sourcing recoveries data from? Is it a reliable source of data? As I understood, this metric was not reliably tracked.

josebsalazar commented 4 years ago

> Cool. We did this a couple of weeks ago for Open Data Day and repo is here (details in https://www.datopian.com/blog/2020/03/17/odd-covid-19/):
>
> https://github.com/datasets/covid-19
>
> Maybe we can join forces?
>
> There's also json data from DataHub dataset we are publishing here https://datahub.io/core/covid-19#data

> > Do you plan to continue updating the data (combined incl. recovered)? https://datahub.io/core/covid-19#data
>
> Yes, we (@datasets / @datopian) plan to keep that updated.
>
> BTW the github source is here https://github.com/datasets/covid-19

Thanks, I was going to write my own Python script, but I will leverage your work.