gluap / covid_micro

quickly hacked webservice to analyze corona virus case numbers (fit exponential to find doubling time) and plot

Time travel in South Africa :-) #2

Open ThimoNeubauer opened 4 years ago

ThimoNeubauer commented 4 years ago

With today's data, the "T_2 over X days" plot looks funny:

image

I guess it's yet another special case in the input data...

gluap commented 4 years ago

The cause is the following: all but the last point come from a data table labeled only with the calendar date. The time at which the number of cases was "measured" is unknown, but most (big) countries report multiple times per day, so it is assumed that the number of cases labeled 2020-03-23 is actually a value from close to midnight at the end of 2020-03-23, i.e. really "2020-03-24 00:00". The last point comes from a realtime API that actually delivers a timestamp for when the point was taken. As we can see, South Africa hasn't delivered data since yesterday ~11am. Because that last point is timestamp-aware and the others are not, the plot shows this "special" behaviour until the first data value of the new day is reported.
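To make the ordering problem concrete, here is a minimal Python sketch. The helper name `end_of_day` and the sample timestamps are made up for illustration and are not taken from the covid_micro code; the only thing it assumes is the "date label means end of that day" convention described above.

```python
from datetime import date, datetime, time, timedelta


def end_of_day(day: date) -> datetime:
    """Map a date-only label to midnight at the end of that day.

    Hypothetical helper: a value labeled 2020-03-23 is placed at
    2020-03-24 00:00.
    """
    return datetime.combine(day, time.min) + timedelta(days=1)


# Last point of the date-only time series, labeled 2020-03-23.
series_last = end_of_day(date(2020, 3, 23))

# Realtime point carrying a real timestamp (illustrative value: the
# country last reported around 11am on the same calendar day).
realtime = datetime(2020, 3, 23, 11, 0)

# The realtime point is appended last but lies earlier on the time
# axis than the series point before it, so the plot runs backwards.
print(series_last, realtime, realtime < series_last)
```

As soon as the realtime API reports a value with a timestamp after `series_last`, the order is monotonic again.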

I've been thinking about adding a plausibility check to address the issue (essentially discarding the realtime point if it's older than the latest point from the time series), but so far it hasn't been pressing enough.
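Such a check could look roughly like the sketch below. This is not what the service currently does: the function name `merge_realtime`, the `(timestamp, cases)` tuple layout, and the case numbers are assumptions chosen for the example.

```python
from datetime import datetime


def merge_realtime(series, realtime_point):
    """Append the realtime point only if it is newer than the series.

    series: list of (datetime, cases) tuples, sorted by time
    realtime_point: (datetime, cases) tuple, or None if unavailable
    Returns a new list; the input list is not modified.
    """
    merged = list(series)
    if realtime_point is None:
        return merged
    # Plausibility check: drop a realtime point that is not strictly
    # newer than the latest point from the time series.
    if merged and realtime_point[0] <= merged[-1][0]:
        return merged
    merged.append(realtime_point)
    return merged


# Illustrative data (case numbers are made up).
series = [
    (datetime(2020, 3, 23, 0, 0), 100),
    (datetime(2020, 3, 24, 0, 0), 130),
]
stale = (datetime(2020, 3, 23, 11, 0), 120)   # older than the last series point
fresh = (datetime(2020, 3, 24, 9, 30), 150)   # newer than the last series point

print(len(merge_realtime(series, stale)))  # stale point is discarded
print(len(merge_realtime(series, fresh)))  # fresh point is appended
```

Using `<=` rather than `<` also drops a realtime point with exactly the same timestamp as the last series point, which avoids a zero-length time step in the fit; whether that is the desired behaviour is a design choice.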