Closed (bukosabino closed 4 years ago)
Hi Dario,
Thanks for raising this issue. I'm sorry, but in this case I think you are mistaken.
Since we compute the proportion of the population in each age group, it doesn't matter whether the values are in thousands or in units. Your calculation gives a different result because of a mistake: simply removing the dot ignores that a trailing 0 is sometimes not displayed. E.g. for Italy, removing the dot turns the value for ages 5-9 from 2768.81 into 276881 instead of 2768810.
I think this is the source of the discrepancy, but it would certainly be good to check for yourself!
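One quick way to check this is the small Python sketch below. It is only an illustration: the value "2768.81" is the Italy figure quoted above, while the other two values and the helper name `proportions` are made up for the example and are not taken from data/age_structure.xlsx.

```python
import math

# Age-group values in thousands, as text. Only "2768.81" comes from the
# thread; the other two are hypothetical, chosen to have different numbers
# of decimals.
raw = ["2500.5", "2768.81", "3000.123"]


def proportions(values):
    """Return each value divided by the sum of all values."""
    total = sum(values)
    return [v / total for v in values]


# Parsing as floats keeps every value on the same scale (thousands).
in_thousands = [float(s) for s in raw]

# Multiplying by 1000 converts thousands to units.
in_units = [v * 1000 for v in in_thousands]

# Deleting the dot multiplies each value by 10**(number of decimals shown),
# which differs from row to row.
dot_removed = [float(s.replace(".", "")) for s in raw]

# Rescaling every value by the same factor leaves the proportions unchanged.
same_after_rescaling = all(
    math.isclose(a, b)
    for a, b in zip(proportions(in_thousands), proportions(in_units))
)

# Removing the dot rescales rows by different factors, so proportions shift.
same_after_dot_removal = all(
    math.isclose(a, b)
    for a, b in zip(proportions(in_thousands), proportions(dot_removed))
)

print(same_after_rescaling, same_after_dot_removal)
print(round(in_units[1]), int(dot_removed[1]))
```

With these inputs the rescaled proportions should match while the dot-removed ones should not, because each row ends up multiplied by a different power of ten.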
Thanks again for your interest, and please do not hesitate to report any bug or issue. Best, Julien
:O
I didn't notice the missing last digit, and you are right: the proportions should be the same whether we work in thousands or in millions.
Best, Dario
Hi @jriou,
I think there is an error in the age distribution calculation in your R experiment:
The problem is related to the format of the data/age_structure.xlsx file. If you take a look at the xlsx file:
I think you need to remove the dot from the values in the dataset to get the real age distribution results, because in R, as.numeric doesn't transform these numbers to integers:
I can share my Python code to do the same:
First version, with the line commented out, to get the same results as yours:
Second version, using the line that deletes the dot, to get the real age distribution:
You can find the error here:
https://github.com/jriou/covid_adjusted_cfr/blob/9e041b8beac0c6a07b626a57e128f128f4e2872c/data/italy/data_management_italy.R#L13
https://github.com/jriou/covid_adjusted_cfr/blob/9e041b8beac0c6a07b626a57e128f128f4e2872c/data/south-korea/data_management_south_korea.R#L19
Let me know if you agree.
Best, Dario