aatishb / covidtrends

Tracking the growth of COVID-19 Cases worldwide
https://aatishb.com/covidtrends/
MIT License
301 stars 106 forks

Dataset mistake - Spain position #156

Open konrad44 opened 4 years ago

konrad44 commented 4 years ago

The graph says Spain had 411 new cases last week. Other sources say it had more than 20,000 new cases. Clearly there is something wrong with the dataset.

rpkoller commented 4 years ago

what is the source for the >20k cases? https://www.worldometers.info ?

i took a look at the data for Spain in the Johns Hopkins dataset. two questions came up (latest value is on the right):

(Screenshot 2020-05-01 at 20:50:15)

  1. i thought the values there are cumulative values but you can see 213024 flanked by two smaller values. that is odd.

  2. if i calculate the confirmed cases of the last seven days for the spanish dataset, it is 213435 - 202990 = 10,445? but the graph shows:

(Screenshot 2020-05-01 at 20:54:13)

cc @aatishb

aatishb commented 4 years ago

@rpkoller The weekly cases calculation seems to be correct.

Most recent cumulative cases - cumulative cases one week prior = 213435 - 213024 = 411

What's weird is that the cumulative case count goes down from 213024 to 202990 the next day. Not much we can do about this, as it's a data issue, but it's worth looking into the reason.
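For reference, here is a minimal Python sketch of that subtraction (not the site's actual code). It uses only the cumulative values quoted in this thread; the dates are the ones mentioned in the comments here, and the function name `new_cases_over_window` is made up for illustration.

```python
from datetime import date, timedelta

# Cumulative confirmed cases for Spain as quoted in this thread.
# Only the days mentioned here are included; the dates are assumed
# from the comments (23.4., 24.4., and the latest value on 30.4.).
cumulative = {
    date(2020, 4, 23): 213024,
    date(2020, 4, 24): 202990,
    date(2020, 4, 30): 213435,
}

def new_cases_over_window(series, latest, days=7):
    """New cases = cumulative on `latest` minus cumulative `days` earlier."""
    start = latest - timedelta(days=days)
    return series[latest] - series[start]

# Seven days back lands on the 23rd, giving the number shown on the graph.
weekly = new_cases_over_window(cumulative, date(2020, 4, 30))

# Six days back lands on the 24th, giving the other figure from this thread.
six_day = new_cases_over_window(cumulative, date(2020, 4, 30), days=6)

print(weekly, six_day)  # 411 10445
```

The two results differ by a factor of about 25 only because the window start shifts by one day across the drop in the cumulative series.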

rpkoller commented 4 years ago

@aatishb ok, then i understood the calculation correctly (i only made a counting mistake -> i counted from day one, 30.4., back to day seven, 24.4., but actually it is 23.4.).

But the problem in that particular case is the inconsistent data. take a look at the value on the 22nd (208389) and on the 24th (202990): both values are smaller than the value on the 23rd (213024). since that is a cumulative value it should be 24th > 23rd > 22nd, but instead it is 24th < 23rd > 22nd. :/ looks like the spanish data source is inconsistent?
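A quick way to spot this kind of inconsistency is to check that the cumulative series never decreases. The sketch below is only an illustration: the helper name `find_decreases` is hypothetical, and the input is just the three values quoted above for the 22nd, 23rd and 24th.

```python
def find_decreases(values):
    """Return (index, previous, current) for every point where a
    supposedly cumulative series goes down."""
    return [
        (i, values[i - 1], values[i])
        for i in range(1, len(values))
        if values[i] < values[i - 1]
    ]

# Cumulative values for the 22nd, 23rd and 24th, as quoted in this thread.
spain = [208389, 213024, 202990]

drops = find_decreases(spain)
print(drops)  # [(2, 213024, 202990)]
```

An empty list would mean the series is consistent with being cumulative; here the single entry points at the drop from the 23rd to the 24th.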

konrad44 commented 4 years ago

I'm not sure if this is the reason, but for 19.04.2020 the Worldometer data says that Spain's new daily cases figure is NEGATIVE 1430. It looks like a human error in data input or some kind of correction.