timriffe / covid_age

COVerAGE-DB: COVID-19 cases, deaths, and tests by age and sex

Italian Output #6

Open mpascariu opened 4 years ago

mpascariu commented 4 years ago

Checking the Italian output, I can see:

   Country Region             Code       Date Sex Age AgeInt   Cases Deaths
1    Italy    All ITinfo15.04.2020 2020-04-15   b   0     10  1008.8    1.0
2    Italy    All ITinfo15.04.2020 2020-04-15   b  10     10     0.0    0.0
3    Italy    All ITinfo15.04.2020 2020-04-15   b  20     10  7061.9    7.0
4    Italy    All ITinfo15.04.2020 2020-04-15   b  30     10 13114.9   39.0
5    Italy    All ITinfo15.04.2020 2020-04-15   b  40     10 19392.1  173.0
6    Italy    All ITinfo15.04.2020 2020-04-15   b  50     10 30103.7  746.0
7    Italy    All ITinfo15.04.2020 2020-04-15   b  60     10 23808.5 2242.1
8    Italy    All ITinfo15.04.2020 2020-04-15   b  70     10 25532.0 6074.3
9    Italy    All ITinfo15.04.2020 2020-04-15   b  80     10 26097.4 7890.4
10   Italy    All ITinfo15.04.2020 2020-04-15   b  90     10  6907.3 1359.0
11   Italy    All ITinfo15.04.2020 2020-04-15   b 100     10  2440.5  976.1

When Age == 10 the number of cases is 0 (unlikely!). I think the cases here are estimated from the death records, which are taken from the daily summary infographics.

However, note that complete information by age and sex is provided on a weekly basis (see April 16). The modeled daily data should be consistent with the complete weekly data at the nearest calendar date, or else be removed.

Also, if data are already reported in the desired age classes, I don't think they should be altered at all. This is the case for the deaths displayed above: only the deaths in the last reported age group, 90+, should be split between 90 and 100. All other ages should be read as reported. I know this complicates the coding a bit, but it would be the right way to do it.
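To make the pass-through suggestion concrete, here is a minimal Python sketch. It is not the project's code: the function name `harmonize_deaths`, the `share_above` parameter, and the counts are all hypothetical, and the fixed share merely stands in for whatever method would actually be used to split the open-ended group.

```python
def harmonize_deaths(reported, open_age=90, split_age=100, share_above=0.1):
    """Pass reported counts through untouched; split only the open-ended group.

    reported    : dict mapping lower age bound -> deaths, already in 10-year groups
    share_above : hypothetical fraction of the open group assigned to split_age+
                  (an assumption for illustration, not a value from the source)
    """
    # Every closed age group is copied as reported, with no modeling applied.
    out = {age: float(count) for age, count in reported.items() if age != open_age}

    # Only the open-ended group (e.g. 90+) is divided between 90-99 and 100+.
    open_count = float(reported.get(open_age, 0.0))
    out[open_age] = open_count * (1.0 - share_above)
    out[split_age] = open_count * share_above
    return dict(sorted(out.items()))


# Made-up counts, for illustration only.
reported = {0: 1, 10: 0, 20: 7, 30: 39, 90: 2000}
result = harmonize_deaths(reported)
```

With this approach the closed groups stay integers as reported, and any decimals would be confined to the two groups produced by the split.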

timriffe commented 4 years ago

The issue is that cases are inferred from deaths and the rounded case fatality rate. We need to find a way to reconcile this with the Bollettino data, TBD.
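A small Python sketch of why this inference is fragile, assuming cases are backed out as deaths divided by a case fatality rate published to one decimal place in percent. The function names and the example figures are hypothetical and are not taken from the project's code.

```python
def inferred_cases(deaths, cfr_percent):
    """Back out cases as deaths / CFR, with CFR given in percent."""
    if cfr_percent == 0:
        # Division by zero: with zero deaths and a CFR that rounds to 0.0%,
        # the case count cannot be recovered, so report it as missing.
        return None
    return deaths / (cfr_percent / 100.0)


def case_range(deaths, cfr_percent, decimals=1):
    """Range of case counts compatible with a CFR rounded to `decimals` places."""
    half = 0.5 * 10 ** (-decimals)          # half-width of the rounding interval
    lo_cfr = max(cfr_percent - half, 0.0)   # smallest CFR that rounds to the value
    hi_cfr = cfr_percent + half             # largest CFR that rounds to the value
    upper = deaths / (lo_cfr / 100.0) if lo_cfr > 0 else float("inf")
    lower = deaths / (hi_cfr / 100.0)
    return lower, upper


# Hypothetical young age group: 7 deaths and a published CFR of 0.1%.
point = inferred_cases(7, 0.1)
low, high = case_range(7, 0.1)

# Hypothetical age group with no deaths and a CFR published as 0.0%.
missing = inferred_cases(0, 0.0)
```

In this example the point estimate is about 7,000 cases, but any value from roughly 4,667 to 14,000 is compatible with the rounded rate. An age group with no deaths carries no information about cases at all, which would explain a reported 0 where the true count is unknown.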

On Mon, Apr 20, 2020 at 11:50 AM Marius D. Pascariu <notifications@github.com> wrote:

Checking the Italian output, I can see:

   Country Region             Code       Date Sex Age AgeInt   Cases Deaths
1    Italy    All ITinfo15.04.2020 2020-04-15   b   0     10  1008.8    1.0
2    Italy    All ITinfo15.04.2020 2020-04-15   b  10     10     0.0    0.0
3    Italy    All ITinfo15.04.2020 2020-04-15   b  20     10  7061.9    7.0
4    Italy    All ITinfo15.04.2020 2020-04-15   b  30     10 13114.9   39.0
5    Italy    All ITinfo15.04.2020 2020-04-15   b  40     10 19392.1  173.0
6    Italy    All ITinfo15.04.2020 2020-04-15   b  50     10 30103.7  746.0
7    Italy    All ITinfo15.04.2020 2020-04-15   b  60     10 23808.5 2242.1
8    Italy    All ITinfo15.04.2020 2020-04-15   b  70     10 25532.0 6074.3
9    Italy    All ITinfo15.04.2020 2020-04-15   b  80     10 26097.4 7890.4
10   Italy    All ITinfo15.04.2020 2020-04-15   b  90     10  6907.3 1359.0
11   Italy    All ITinfo15.04.2020 2020-04-15   b 100     10  2440.5  976.1

When Age == 10 the number of cases is 0 (unlikely!). I think the cases here are estimated from the death records, which are taken from the daily summary infographics: https://www.epicentro.iss.it/en/coronavirus/bollettino/Infografica_15aprile%20ENG.pdf

However, note that complete information by age and sex is provided on a weekly basis (see April 16: https://www.epicentro.iss.it/coronavirus/bollettino/Bollettino-sorveglianza-integrata-COVID-19_16-aprile-2020.pdf). The modeled daily data should be consistent with the complete weekly data at the nearest calendar date, or else be removed.

Also, if data are already reported in the desired age classes, I don't think they should be altered at all. This is the case for the deaths displayed above: only the deaths in the last reported age group, 90+, should be split between 90 and 100. All other ages should be read as reported. I know this complicates the coding a bit, but it would be the right way to do it.


Nicefyy commented 3 years ago

Hello, why is there a decimal in the death toll?

timriffe commented 3 years ago

There are 3 things (that I can think of right now) that can lead to decimals in estimated/harmonized counts.

  1. counts of unknown age are redistributed proportionally to counts of known age
  2. if marginal totals are captured, then age-specific counts are rescaled to sum properly
  3. when ages are split from 10-year to 5-year age groups, a statistical model is used

We round output to one decimal place rather than to integers in part because it makes users notice this point. This is in keeping with practice at other standardized databases, such as the Human Mortality Database.
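The first two mechanisms can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the database's actual pipeline: the function names and the counts are made up, and the third mechanism (model-based age splitting) is not shown.

```python
def redistribute_unknown(known, unknown):
    """Spread counts of unknown age over age groups in proportion to known counts."""
    total = sum(known.values())
    if total == 0:
        return {age: float(c) for age, c in known.items()}  # nothing to scale by
    return {age: c + unknown * c / total for age, c in known.items()}


def rescale_to_total(counts, target_total):
    """Rescale age-specific counts so that they sum to a reported marginal total."""
    total = sum(counts.values())
    if total == 0:
        return {age: float(c) for age, c in counts.items()}
    factor = target_total / total
    return {age: c * factor for age, c in counts.items()}


# Made-up integer inputs: deaths by age, plus 3 deaths of unknown age.
known = {0: 1, 10: 0, 20: 7, 30: 39}
step1 = redistribute_unknown(known, unknown=3)

# Suppose a separately reported total of 52 deaths must be matched.
step2 = rescale_to_total(step1, target_total=52)

# Output rounded to one decimal place, as described above.
rounded = {age: round(c, 1) for age, c in step2.items()}
```

Even though every input here is an integer, both steps produce fractional counts, while a group with zero known deaths stays at zero.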

On Mon, Sep 28, 2020 at 11:14 AM Nicefyy notifications@github.com wrote:

Hello, why is there a decimal in the death toll?
