Open pwinskill opened 10 months ago
Yes. The current implementation calls birth$get_index_of(a = timestep - upper, b = timestep - lower)
for each age group each timestep.
I'd guess a c++ implementation which computes all of the stats from all the births and intervals will save ~99 RCpp calls per iteration.
Found an R implementation for a personal project which is 6x faster feat/grid_render.
But it makes assumptions about the age groups which will be output (year wide age groups from 0 to 100). And some kinks need to be resolved, it doesn't output the same values as the existing method for some reason 🤔
Test, outputs and plots are attached below:
# Parameters that specify more output with a single age group
p_single_age <- get_parameters(
overrides = list(
human_population = 1000
)
) |>
set_equilibrium(
init_EIR = 5
)
p_single_age$prevalence_rendering_min_ages <- 0
p_single_age$prevalence_rendering_max_ages <- 100 * 365
# Parameters that specify output with multiple age groups
p_multiple_age <- p_single_age
p_multiple_age$prevalence_rendering_min_ages <- 0:99 * 365
p_multiple_age$prevalence_rendering_max_ages <- 1:100 * 365 - 1
p_grid <- p_single_age
p_grid$render_grid <- TRUE
set.seed(1)
system.time({
s_single_age <- run_simulation(
timesteps = 365 * 2,
parameters = p_single_age
)
})
set.seed(1)
system.time({
s_grid <- run_simulation(
timesteps = 365 * 2,
parameters = p_grid
)
})
jpeg('grid.jpg')
for (i in c(1, 50, 100)) {
if (i == 1) {
plot(s_grid$grid$timestep, s_grid$grid$n_detect[,i], type = 'l')
} else {
lines(s_grid$grid$timestep, s_grid$grid$n_detect[,i])
}
}
dev.off()
set.seed(1)
system.time({
s_multiple_age <- run_simulation(
timesteps = 365 * 2,
parameters = p_multiple_age
)
})
jpeg('multiple.jpg')
for (i in c(1, 50, 100)) {
if (i == 1) {
plot(
as.numeric(s_multiple_age$timestep),
as.numeric(s_multiple_age[[paste0(
'n_detect_',
p_multiple_age$prevalence_rendering_min_ages[i],
'_',
p_multiple_age$prevalence_rendering_max_ages[i]
)]]),
type = 'l'
)
} else {
lines(
as.numeric(s_multiple_age$timestep),
as.numeric(s_multiple_age[[paste0(
'n_detect_',
p_multiple_age$prevalence_rendering_min_ages[i],
'_',
p_multiple_age$prevalence_rendering_max_ages[i]
)]])
)
}
}
dev.off()
outputs:
root@c40d0e961b76:/opt# Rscript test.R
user system elapsed # single
4.994 0.000 4.995
user system elapsed # grid
6.935 0.000 6.910
user system elapsed # multiple
46.173 0.000 46.169
Multiple:
Grid:
It was an RNG issue
Specifying incidence of clinical or severe malaria disaggregated by age can slow down the model a lot. Having spoken with @giovannic, this is likely more to do with the implementation of estimating age-disaggregated outputs, rather than the overhead of rendering more output.
An example comparing a single age-band with 100 age bands: