starschema / COVID-19-data

Unpivoted and cleaned data sets on the COVID-19 pandemic
https://starschema.github.io/COVID-19-data
BSD 3-Clause "New" or "Revised" License

Add Active/Recovered for non-US Entries #78

Closed tfoldi closed 4 years ago

tfoldi commented 4 years ago

JHU stopped ingesting Recovered from 3/22 - we should stop adding those records to our datasets.

danteo13 commented 4 years ago

Is this still the case? I can see Recovered and Active data in the JHU CSSE Covid-19 repository.

tfoldi commented 4 years ago

This is a great point, as it seems they are updating the non-US recovery numbers. However, their note specifically points only to the confirmed and deaths files:

"Please reference time_series_covid19_confirmed_global.csv and time_series_covid19_deaths_global.csv for the latest time series data."

https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series

I will cross-check their numbers with their sources to see if those numbers are really updated. If yes, we will move it back for global countries listed in that file.

danteo13 commented 4 years ago

Thanks for looking into this! I see that the first commit for time_series_covid19_recovered_global.csv is on March 25th, so maybe they forgot to update the message.

Looking at the data in more detail, it seems they have Recovered data for the US and Canada, but only at the total country level. In the daily reports, they handled this by adding a 'Recovered' province for the US and Canada, though I'm not sure this is the best solution. 'Active' is not filled in, but we can derive it from the other three metrics (Confirmed, Deaths, Recovered) anyway.
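A minimal sketch of that derivation, assuming Active is defined as Confirmed minus Deaths minus Recovered. The function name `derive_active` and the sample rows are hypothetical and are not taken from the JHU files or from this repository's code:

```python
from typing import Optional


def derive_active(confirmed: int, deaths: int, recovered: Optional[int]) -> Optional[int]:
    """Return Confirmed - Deaths - Recovered, or None when Recovered is missing."""
    if recovered is None:
        # Without a Recovered count, Active cannot be derived.
        return None
    return confirmed - deaths - recovered


# Hypothetical figures for illustration only, not real JHU numbers.
rows = [
    {"region": "A", "confirmed": 1000, "deaths": 50, "recovered": 300},
    {"region": "B", "confirmed": 200, "deaths": 5, "recovered": None},
]

for row in rows:
    row["active"] = derive_active(row["confirmed"], row["deaths"], row["recovered"])
```

Returning None for a missing Recovered value, rather than treating it as zero, avoids overstating Active for regions where recoveries are simply not reported.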

saran88 commented 4 years ago

Hi, are we still not updating the Recovered cases? The file that I downloaded today did not have any values for Recovered. Please confirm.

saran88 commented 4 years ago

@tfoldi, do you know when this issue will be fixed?

chrisvoncsefalvay commented 4 years ago

Hi @saran88 and others following this :)

In clinical virology, recovery is a complicated concept. A case is temporally unambiguous (either via testing or via clinical diagnosis), as is the demise of a patient. A recovery is not. For instance, recovery may be

Because of these vast differences, the difficulty of structured follow-up without adequate testing capacity to confirm recovery, and the low priority given to follow-up at the current number of cases, we have very low confidence in recovery data, past or present. This opinion is shared by most of those working in the field. For now, therefore, we have opted against including recoveries.

As always, we seek to serve the wider community, and we are happy to hear from all of you on how you wish us to proceed. We're always open to suggestions, and for this reason I'm leaving this ticket open to solicit further input from the wider user community.

saran88 commented 4 years ago

Hi @chrisvoncsefalvay ,

I do understand the point that recovery is a complicated metric. However, the notable COVID data providers, such as JHU and ECDC, are still publishing the daily recovered count.

Though it is not 100% accurate, when doing data analytics it's worth looking at the Recovered cases as well. Without them, we may not get a picture of the current active cases that we are dealing with.

From a dataset perspective, you could still provide the numbers, and it's at the discretion of the user whether to use them or not. This is my personal take on this issue.

Do let me know what your thoughts are.

Thanks!