covid19datahub / COVID19

A worldwide epidemiological database for COVID-19 at fine-grained spatial resolution
https://covid19datahub.io
GNU General Public License v3.0

articles/iso/USA #122

Closed utterances-bot closed 3 years ago

utterances-bot commented 3 years ago

United States • COVID-19 Data Hub

https://covid19datahub.io/articles/iso/USA.html

Inglezos commented 3 years ago

Hello, I noticed that the number of recovered cases is significantly lower than the number worldometers.info reports: https://www.worldometers.info/coronavirus/country/us/ Why is this happening? Can you fix it?

Inglezos commented 3 years ago

The difference is about 2.3 million. There is also a significant difference in the total cases, about 250.000.

eguidotti commented 3 years ago

Hello, the data you refer to are provided by Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), and they seem consistent with the historical data they provide: https://github.com/CSSEGISandData/COVID-19 Would you mind reporting the issue to them? Any change by JHU CSSE is automatically reflected in the data hub.

Inglezos commented 3 years ago

I have already contacted the JHU systems team by email (at jhusystems@gmail.com) about this incorrect data, but they have not replied yet. Please see my detailed comment with a proposed solution at https://covid19datahub.io/articles/iso/GRC.html

Inglezos commented 3 years ago

You can compute the historical recovered cases for the USA and Greece, as well as for any other country, from the active cases. According to the Definitions section of https://www.worldometers.info/coronavirus/about/:

Active Cases = Total Cases - Total Deaths - Total Recovered

Thus:

Total Recovered = Total Cases - Total Deaths - Active Cases

Then, for the past daily recovered cases, you could proceed with:

Daily Recovered[i] = Total Recovered[i] - Total Recovered[i-1]

and apply this day by day, going back to the first days of the pandemic for the specific country.

For example, for the USA and for yesterday (23 October 2020), this is valid as well, since:

Total Cases = 8.746.953
Total Deaths = 229.284
Active Cases = 2.819.508

Thus Recovered Cases = 8.746.953 - 229.284 - 2.819.508 = 5.698.161, which is almost the same value as the one worldometers.info has (almost, because additional recovered cases have been reported since yesterday).
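The formulas above can be sketched in a few lines of Python. This is only an illustration, not the data hub's code: the function names are hypothetical, the inputs are plain lists with one value per day, and the only real figures used are the ones quoted in this comment for 23 October 2020.

```python
def total_recovered(total_cases, total_deaths, active_cases):
    """Total Recovered = Total Cases - Total Deaths - Active Cases, per day."""
    return [c - d - a for c, d, a in zip(total_cases, total_deaths, active_cases)]


def daily_recovered(cumulative):
    """Daily Recovered[i] = Total Recovered[i] - Total Recovered[i-1].

    Assumption: the first day has no predecessor, so its daily value is
    taken to be the cumulative value itself.
    """
    if not cumulative:
        return []
    return [cumulative[0]] + [
        cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))
    ]


# USA figures for 23 October 2020, as quoted in the comment above.
usa = total_recovered([8_746_953], [229_284], [2_819_508])
print(usa[0])  # 5698161
```

The approach needs a historical series of active cases for every day, which is the open question in the rest of this thread.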

I think this is something your team could do very soon, and it would resolve many inconsistencies in the recovered cases. It has to be applied at least for the USA and Greece, as far as I know. If I find any other country I will contact you again!

eguidotti commented 3 years ago

Hello @Inglezos, many thanks for your detailed answer and efforts! Before fixing this, I'd need some further explanation of the proposed solution.

I can compute today's Total Recovered = Total Cases - Total Deaths - today's Active Cases. However, it is not clear to me how to retrieve the historical Total Recovered. One solution would be to compute the daily recovered from the data we have from JHU CSSE and then cumulate them back into the past. But this means that we are basically changing only the level of Total Recovered, and we are assuming that the daily recovered by JHU CSSE are correct, which at this point I'm not very sure about. Do you happen to know some source for historical daily recovered for cross-checking?
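To make the "changing only the level" point concrete, here is a minimal sketch of that first option. The function name and the numbers are made up for illustration; it assumes one daily increment per day and a trusted value for the latest Total Recovered.

```python
def backcast_cumulative(daily, latest_total):
    """Rebuild a cumulative series backwards from a trusted latest total.

    daily[i] is the increment reported for day i; the last day is pinned
    to latest_total and earlier days are obtained by subtracting increments.
    """
    cumulative = [0] * len(daily)
    running = latest_total
    for i in range(len(daily) - 1, -1, -1):
        cumulative[i] = running
        running -= daily[i]
    return cumulative


# Hypothetical source series: cumulative 0, 10, 30 -> daily 0, 10, 20.
source_cumulative = [0, 10, 30]
source_daily = [0, 10, 20]

rebuilt = backcast_cumulative(source_daily, latest_total=100)
print(rebuilt)  # [70, 80, 100]

# Every day is shifted by the same constant, so the daily values are unchanged.
offsets = {r - s for r, s in zip(rebuilt, source_cumulative)}
print(offsets)  # {70}
```

The rebuilt series differs from the source by a single constant offset, so any error in the source's daily values carries over unchanged, and the early days can end up with implausible totals.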

An alternative would be to use Total Recovered = Total Cases - Total Deaths - Active Cases, if some source for historical active cases is available. Are you aware of such a provider? I can see a plot of historical active cases here: https://www.worldometers.info/coronavirus/country/us/ However, it seems to me that worldometers is not open sourcing the data, and web scraping would probably be against their terms of use. I see here that they were/are even selling a license just to place their counters on websites.

Inglezos commented 3 years ago

I was suggesting only the second solution you mention. I meant to extract the historical active cases data from either JHU or worldometers, yes. I don't know if this is available somewhere else. And you are probably right, web scraping would require a license from worldometers, maybe...

Can you think of any other solutions? We know for sure that these data are wrong, and something has to be done until JHU fixes this or at least responds to inform us. Please send JHU an email yourself too; maybe you will have better luck getting a reply from them (I guess they are busy, but this is urgent and has to be solved soon).

In the worst case, would it be difficult for the covid19datahub team to get a license from worldometers? Can you send them an email as well to ask what can be done from your side, and let me know?

eguidotti commented 3 years ago

Unfortunately, I don't think we can get a license from worldometers, as the project is 100% open source and the data as well. Moreover, we are all volunteers maintaining the data hub. I just opened an issue at JHU CSSE. Fingers crossed!

Inglezos commented 3 years ago

Yes, of course you are right. We cannot do that; we have to keep this 100% open source. Let's hope this is addressed by the JHU CSSE team soon! If you have any news or updates, please inform me either here or at my personal email: inglezos@ece.auth.gr. Thank you very much for your time!

Inglezos commented 3 years ago

I just found https://www.coronatracker.com/. Could you please check this out as a data source and contact them?

eguidotti commented 3 years ago

I checked, but they rely on third-party data. From https://www.coronatracker.com/analytics I see:

Sources: WHO, CDC, ECDC, NHC of the PRC, JHU CSSE, DXY, QQ, and various international media

Let's wait for JHU CSSE. It seems they are working on it.

Inglezos commented 3 years ago

The JHU CSSE team has updated and closed this issue. Thank you for your support!