epiforecasts / socialmixr

R package for deriving social mixing matrices from survey data.
http://epiforecasts.io/socialmixr/

Numeric artefacts when sub-group sizes are very different #145

Closed lwillem closed 1 week ago

lwillem commented 4 weeks ago

When sub-groups differ greatly in size, the normalisation used to obtain a symmetric matrix can produce unexpected artefacts. For example:


library(socialmixr)
vietnam_survey <- get_survey("https://doi.org/10.5281/zenodo.1289473")
contact_matrix(survey = vietnam_survey, age.limits = c(0,90), estimated.contact.age = 'sample', symmetric = FALSE)$matrix

         [0,90)   90+
[0,90)      7.6   0.1
90+         5.0   1.0

contact_matrix(survey = vietnam_survey, age.limits = c(0,90), estimated.contact.age = 'sample', symmetric = TRUE)$matrix

         [0,90)   90+
[0,90)      7.6  0.06
90+          22   1.0
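
The jump in the 90+ row can be traced with a little arithmetic. The following is a minimal sketch, assuming symmetrisation replaces each entry c_ij with (c_ij N_i + c_ji N_j) / (2 N_i), so that c_ij N_i = c_ji N_j holds afterwards. The population sizes `N` are hypothetical, chosen only to mimic a very small 90+ group; they are not taken from the Vietnam survey, so the result matches the output above in magnitude rather than exactly.

```r
# Non-symmetric matrix as reported above
# (rows: participant age group, columns: contact age group)
m <- matrix(c(7.6, 0.1,
              5.0, 1.0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("[0,90)", "90+"), c("[0,90)", "90+")))

# Hypothetical population sizes (assumption, not from the survey)
N <- c(1e6, 3000)

# Total contacts between groups: element [i, j] is c_ij * N_i
totals <- diag(N) %*% m

# Average the two directions, then convert back to per-capita rates
m_sym <- 0.5 * diag(1 / N) %*% (totals + t(totals))
dimnames(m_sym) <- dimnames(m)

# Ratio by which each entry changed during symmetrisation
ratio <- m_sym / m
print(round(m_sym, 3))
print(round(ratio, 2))
```

With these made-up sizes, the 90+ to [0,90) entry rises from 5.0 to about 19.2, while the [0,90) to 90+ entry falls to about 0.058. The small group's row is dominated by the contacts reported by the much larger group, which is the artefact described here.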

I suggest adding a warning when the normalisation factor exceeds a threshold, e.g. 2 (or making this threshold a function parameter). Large differences in sub-population size under the current age breaks are likely to produce artefacts once the matrix is made symmetric. In that case, the user should reconsider the age breaks to obtain more equally sized sub-populations.
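
A rough sketch of what such a check could look like. The helper `check_symmetry_factor()` and its `threshold` argument are hypothetical and not part of socialmixr. The sketch also assumes that "normalisation factor" means the ratio between an entry after and before symmetrisation, which is only one possible reading.

```r
# Hypothetical helper, not part of socialmixr: warn when symmetrisation
# rescales any matrix entry by more than `threshold`, in either direction.
check_symmetry_factor <- function(m, m_sym, threshold = 2) {
  ratio <- m_sym / m
  # Drop entries that were zero before symmetrisation (ratio is NaN or Inf)
  ratio <- ratio[is.finite(ratio) & ratio > 0]
  if (length(ratio) == 0) {
    return(invisible(NA_real_))
  }
  # Largest change, treating a factor of 1/4 the same as a factor of 4
  worst <- max(c(ratio, 1 / ratio))
  if (worst > threshold) {
    warning("Symmetrisation changed a matrix entry by a factor of ",
            round(worst, 1), " (threshold ", threshold,
            "); consider age groups of more similar size.",
            call. = FALSE)
  }
  invisible(worst)
}

# Matrices from the example in this issue
m     <- matrix(c(7.6, 0.1, 5.0, 1.0), nrow = 2, byrow = TRUE)
m_sym <- matrix(c(7.6, 0.06, 22, 1.0), nrow = 2, byrow = TRUE)
check_symmetry_factor(m, m_sym)  # warns: 22 / 5.0 = 4.4 exceeds 2
```

Returning the largest factor invisibly lets a caller inspect it without changing what is printed; whether the check belongs inside `contact_matrix()` or in a separate function is left open here.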

lwillem commented 4 weeks ago

See pull request #146.