What about running it on JHU CSSE data?

Thanks, it's a good question.

In general I prefer WHO data to JHU data because, for the countries that share such data (mostly European), WHO reports deaths by date of death rather than by date of reporting. Here is a figure from last year that illustrates this:

who_vs_jhu_daily

Look e.g. at Sweden: JHU has a lot of within-week fluctuations because Sweden reports fewer deaths on weekends. WHO uses data by date of death, so there are no within-week fluctuations at all. For such countries, WHO data are clearly better.

That said, Boudewijn Roukema recently pointed out to me that for some other countries JHU data are better and less jumpy: it seems that on some days WHO simply did not update the time series, so it reports 0 deaths on those days. One example is Algeria. Here is WHO:

Screenshot from 2022-03-15 12-07-26

While here is JHU (unsmoothed):

coronavirus-data-explorer(7)(1)

Here JHU clearly has better data.

My take is that the best approach may be to grab both data sources and then choose the less noisy one for each country... But I have not tried that yet.
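A minimal sketch of what choosing the less noisy source per country could look like. This is untested on the real data and rests on assumptions: both sources are already loaded as pandas DataFrames of daily deaths (dates as index, countries as columns), and "noise" is scored as the mean absolute deviation from a centered 7-day rolling mean, relative to the mean daily count. The function names `roughness` and `pick_less_noisy`, the noise score itself, and the toy numbers are all hypothetical, chosen only to mimic the Sweden and Algeria patterns above.

```python
import numpy as np
import pandas as pd


def roughness(daily: pd.DataFrame) -> pd.Series:
    """Per-country noise score (an assumed metric, not an established one).

    Mean absolute deviation of the daily counts from a centered 7-day
    rolling mean, divided by the mean daily count.
    """
    smooth = daily.rolling(7, center=True, min_periods=7).mean()
    resid = (daily - smooth).abs().mean()  # NaN edges are skipped by default
    return resid / daily.mean().replace(0, np.nan)


def pick_less_noisy(who: pd.DataFrame, jhu: pd.DataFrame) -> pd.Series:
    """Return 'WHO' or 'JHU' for each country present in both sources."""
    common = who.columns.intersection(jhu.columns)
    scores = pd.DataFrame(
        {"WHO": roughness(who[common]), "JHU": roughness(jhu[common])}
    ).dropna(how="all")
    return scores.idxmin(axis=1)


# Toy data (made up): four weeks starting on a Monday.
dates = pd.date_range("2021-01-04", periods=28, freq="D")
weekend = dates.dayofweek >= 5

# Sweden-like case: JHU dips to zero on weekends, WHO is smooth.
jhu_sweden = np.where(weekend, 0.0, 14.0)

# Algeria-like case: WHO reports 0 on a couple of skipped days, JHU is smooth.
who_algeria = np.full(28, 7.0)
who_algeria[[10, 20]] = 0.0

who = pd.DataFrame({"Sweden": np.full(28, 10.0), "Algeria": who_algeria}, index=dates)
jhu = pd.DataFrame({"Sweden": jhu_sweden, "Algeria": np.full(28, 7.0)}, index=dates)

choice = pick_less_noisy(who, jhu)
print(choice)
```

On the toy data this selects WHO for the Sweden-like series and JHU for the Algeria-like one. A different noise score could easily change the choice for borderline countries, so the metric would need checking against the real time series.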

Regarding sub-country-level data, I did the analysis on US states and Russian federal regions (see my Python notebook), grabbing the data from CDC and Russian authorities directly.

I am re-opening this issue, because I think it'd be great to run it on JHU and see what happens.

dkobak / covid-underdispersion