H4kor / end-of-covid

Rough estimations of the end of covid19
https://h4kor.github.io/end-of-covid/

Display Uncertainty #1

Closed Sogolumbo closed 3 years ago

Sogolumbo commented 3 years ago

Please also display the uncertainty of your calculation. I know it's pretty hard to find a meaningful procedure here. The estimation of the uncertainty does not have to be perfect if you explain which kind of (statistical) errors go into your calculation and which (systematic) errors are not taken into account (amount of testing, delayed test reporting due to holidays, ...).

I'm not really sure how to calculate the statistical uncertainty in this case. Here are two (more or less problematic) ideas:

  1. Get the relative uncertainty of the daily cases of the latest 7 days [numpy.std(daily_cases)/numpy.mean(daily_cases)]. Apply this relative uncertainty to your weekly case number and propagate the error with your calculation. This uncertainty might be too high during normal weeks without any special holidays but more accurate when there are many holidays.
  2. You could also "simply" get the uncertainty of the 7-day cases by calculating them for several shifts, e.g. from 0 to 7 days ago (the problem here might be that the values are correlated, since they are just sums of the daily cases), or for one and two weeks ago. This calculation of the uncertainty extends over a fairly long time span in which a significant change in infections can be expected. Changing infection rates will lead to a higher uncertainty, which makes sense and should be communicated.
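A minimal sketch of idea 1, assuming `daily_cases` is a plain sequence of daily new-case counts with the oldest value first. The function name is hypothetical and not taken from the project; it only covers the weekly case number, not the further error propagation.

```python
import numpy as np


def weekly_cases_with_uncertainty(daily_cases):
    """Return (weekly_sum, absolute_uncertainty) for the latest 7 days.

    Hypothetical helper: the relative spread of the daily values
    (std / mean) is applied to the 7-day sum, as described in idea 1.
    """
    last_week = np.asarray(daily_cases[-7:], dtype=float)
    if last_week.size < 7:
        raise ValueError("need at least 7 daily values")

    weekly_sum = last_week.sum()
    mean = last_week.mean()
    if mean == 0:
        # No cases at all: the relative uncertainty is undefined, report 0.
        return weekly_sum, 0.0

    # np.std defaults to the population standard deviation (ddof=0).
    relative_uncertainty = np.std(last_week) / mean
    return weekly_sum, relative_uncertainty * weekly_sum
```

With perfectly constant daily counts the uncertainty is 0; a strong weekday pattern inflates it, which is the "too high during normal weeks" effect mentioned above.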

I just noticed that I wrote all of this thinking about the uncertainty of the infection rate, when what we need is the uncertainty of the change of the infection rate. So my suggestions are wrong, but as the thoughts and ideas are still relevant, I'll leave this here.

Sogolumbo commented 3 years ago

Side note: you compare [n-1] to [n-7]. Those are different days of the week. Was that intentional, and if so, why?

To calculate the uncertainty, I suggest the following:

S(t): infections on day t
n: today/now
i: shift

Difference of the infection rate with one week of time difference: d(i) = S(n-i-1) - S(n-i-8)

Now we calculate the mean 7-day difference of the past week: D_mean = np.mean(d([0:6])). And its statistical uncertainty: D_std = np.std(d([0:6])). Now you only need to propagate the error to your result. Ask me for help if you need some.
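The suggestion above can be sketched as follows. This assumes `series` holds one value per day with the oldest first, so that S(n-1) is the last element, and that "the past week" means the shifts i = 0..6. The function name and the `shifts` parameter are hypothetical, not part of the project.

```python
import numpy as np


def weekly_difference_stats(series, shifts=7):
    """Return (D_mean, D_std) of the one-week differences d(i).

    d(i) = S(n-i-1) - S(n-i-8) for i = 0 .. shifts-1, where n is the
    number of available days (so S(n-1) is the most recent value).
    """
    s = np.asarray(series, dtype=float)
    n = s.size
    if n < shifts + 7:
        raise ValueError(f"need at least {shifts + 7} daily values")

    # Each d(i) compares two values exactly 7 days apart (same weekday).
    d = np.array([s[n - i - 1] - s[n - i - 8] for i in range(shifts)])

    # np.std defaults to the population standard deviation (ddof=0).
    return d.mean(), d.std()
```

With the default `shifts=7` this needs 14 days of data. Because every difference spans exactly 7 days, the weekday effect raised in the side note cancels out of each d(i).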

H4kor commented 3 years ago

I'm looking at the active cases, instead of the infection rate. This should be less volatile than the infection rate (but still volatile). I'm looking at a 1 week diff (lastDate is calculated from nowDate).

Looking at multiple changes and giving an uncertainty is a good idea; I will look into it.

Sogolumbo commented 3 years ago

Thanks, I like the visual implementation.