dkesada / dbnR

Gaussian dynamic Bayesian networks structure learning and inference based on the bnlearn package
GNU General Public License v3.0

Does the predict_dt function use the current time information? #16

Closed: CadePGCM closed this issue 1 year ago

CadePGCM commented 1 year ago

Let's say our variables are a_t_0, a_t_1 and b_t_0, b_t_1. If I call predict_dt with obj_var = a_t_0, will it use b_t_0 in that prediction (and so introduce lookahead bias)?

dkesada commented 1 year ago

Hi! Yes, the predict_dt function predicts the values of the obj_var variables using all of the remaining variables as evidence, including the other variables in t_0. To predict the value of a_t_0 without using the rest of the variables in the t_0 slice, you should pass all of the variables in t_0 in the obj_var argument. In this case that would be something like obj_var = c("a_t_0", "b_t_0") in the function call, although it is better to build that vector programmatically if there are many variables. The same applies to the mvn_inference function, which is one of the most basic inference functions in the package. The forecast_ts function behaves differently: all the variables in t_0 are forecast simultaneously, so no lookahead bias is introduced unless the prov_ev argument is used to introduce interventions in the forecast.
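As a minimal sketch of building that vector programmatically, the t_0 columns can be picked out by their name suffix with base R. The data frame below is a toy with made-up values that only reuses the column names from the question, and the commented-out predict_dt call is hypothetical: it assumes an already fitted network called fit, and the argument is called obj_var in this thread but obj_nodes in the later comment, so check the name against your installed version.

```r
# Toy data frame using the column names from the question (values are made up).
dt <- data.frame(
  a_t_0 = c(1.0, 2.0), b_t_0 = c(0.5, 0.7),
  a_t_1 = c(0.9, 1.1), b_t_1 = c(0.4, 0.6)
)

# Every column whose name ends in "_t_0" belongs to the current time slice.
t0_vars <- grep("_t_0$", names(dt), value = TRUE)
print(t0_vars)
# [1] "a_t_0" "b_t_0"

# Hypothetical call (assumes `fit` is a fitted network; argument name may differ):
# res <- predict_dt(fit, dt, obj_nodes = t0_vars)
```

Anchoring the pattern with $ keeps it from matching names that merely contain "_t_0" somewhere in the middle, such as a hypothetical "x_t_0_raw".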

I rarely use the predict_dt function, and I'm realizing now that this behaviour can be misleading, given that it is natural to only introduce in the obj_var argument the variables that one is interested in. I'll modify the documentation and think about changing the base behaviour of the function in a future version. Thanks for the heads up!

CadePGCM commented 1 year ago

Thanks a lot for the fast and helpful response.

dkesada commented 1 year ago

This is now fixed in commit 8e8151d4f3f40f6b7813ef3727a6086844f33a4b in the devel branch. There is now a boolean parameter look_ahead, which defaults to FALSE, that specifies whether the variables in t_0 should be used as evidence even if they do not appear in the obj_nodes parameter. Unless the user specifies otherwise, no look-ahead bias is introduced.
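To illustrate why the choice of evidence matters, here is a small base R sketch of standard Gaussian conditioning on a toy joint distribution. It is not dbnR's code: the cond_mean helper, the zero means, the covariance matrix and the observed values are all made up for illustration. It compares the conditional mean of a_t_0 when b_t_0 is included in the evidence (the look-ahead case) with the case where only the t_1 variables are used.

```r
# Toy joint Gaussian over the four variables (all numbers are assumptions).
vars  <- c("a_t_0", "b_t_0", "a_t_1", "b_t_1")
mu    <- setNames(rep(0, 4), vars)
sigma <- matrix(0.5, 4, 4, dimnames = list(vars, vars))
diag(sigma) <- 1

# Hypothetical helper: E[obj | ev = x] = mu_obj + S_oe %*% solve(S_ee) %*% (x - mu_ev)
cond_mean <- function(mu, sigma, obj, ev, x) {
  s_oe <- sigma[obj, ev, drop = FALSE]
  s_ee <- sigma[ev, ev, drop = FALSE]
  drop(mu[obj] + s_oe %*% solve(s_ee, x[ev] - mu[ev]))
}

# Observed values for everything except the target a_t_0.
x <- c(b_t_0 = 2, a_t_1 = 1, b_t_1 = 1)

# Evidence includes b_t_0 from the same slice as the target (look-ahead).
with_t0 <- cond_mean(mu, sigma, "a_t_0", c("b_t_0", "a_t_1", "b_t_1"), x)

# Evidence restricted to the previous slice only.
without_t0 <- cond_mean(mu, sigma, "a_t_0", c("a_t_1", "b_t_1"), x)

print(c(with_t0 = with_t0, without_t0 = without_t0))
```

With these toy numbers the two conditional means are 1 and 2/3, so including b_t_0 in the evidence visibly changes the prediction for a_t_0.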