phac-nml-phrsd / reem

Respiratory infection forecaster
1 stars 0 forks source link

bug(s) when filling dates for `B` to match simulation horizon #20

Closed papsti closed 2 months ago

papsti commented 2 months ago

@davidchampredon, i've noticed one definite issue and one potential bug in the date filling functionality of check_B().

(possible bug) simulate.R#276 date.horiz = obj$prms$date.start + obj$prms$horizon

here, the simulation end date is being calculated as the start date + the horizon. should this not be start date - start delta + horizon? that's how i understood start delta to be used. am i wrong?

(definite bug) simulate.R#293:

if(obj$prms[['B']]$date[nb] < date.horiz){

this check occurs with the number of rows of B, nb, computed before the fill towards the start dates gets complete (if it does). nb should be recomputed, as on line 300 before line 293. otherwise, the warning message reports a last B date that is too early and will trigger even when the last B date is exactly the simulation end date (if there is any filling toward the start date). this then causes an error in line 303 when there is an attempt to compute d2 = seq.Date(obj$prms[["B"]]$date[nb] + 1, date.horiz, by = 1) (the first date will be after the second date so R will report that the sign of the by argument is wrong)

i can fix the second one myself and create a pull request, but i hesitate to fix the first one without clarifying with you whether it was intended or not.

davidchampredon commented 2 months ago

Regarding the "possible bug", the variable start.delta is used only when fitting (function reem_traj_dist_obs() in fit.R). The variable start.delta is used to update start.date and horizon before the simulation is run (within the fitting process). The function check_B() checks for B at the simulation step only (in reem_simulate_epi ()), so it is correct to ignore start.delta in the check. However, it should be more obvious that start.delta is used only during the fit. If the user wants to change the start date of the simulation, they should update prms$start.date directly. I will make that clear (new issue #21, thanks for pointing this out.

"definite bug": yes indeed a bug. Thanks! Fixed!

papsti commented 2 months ago

that makes sense about the start.delta, but what happens to B in fit simulations when start.delta is non-zero in the fit? is there another step where B[1] is filled to the new start date of date.start - start.delta and B[n] is filled to the new end date (if need be)? or is simulate_epi being called here too, so it's using the same code for filling in the fit sims?

davidchampredon commented 2 months ago

what happens to B in fit simulations when start.delta is non-zero in the fit?

simulate_epi is called inside reem_traj_dist_obs(), hence check_B() is called too, so B will be extrapolated.

is there another step where B[1] is filled to the new start date of date.start - start.delta and B[n] is filled to the new end date (if need be)? or is simulate_epi being called here too, so it's using the same code for filling in the fit sims?

Only check_B() overwrites B and this happens only when simulate_epi() is called. So, that should be airtight...

papsti commented 2 months ago

i agree, as long as there is no point where extra B dates get dropped in a way that gets re-written to B in obj... if so, the fill and drop cycle could cause simulations to happen with undesired behaviour at the ends of the simulation period... is there any dropping?

papsti commented 2 months ago

i can confirm that after updating to the latest version of reem, this issue is resolved!

just a note for myself, i documented my work understanding these bugs in reemscenex@bdbb0140