dkesada / dbnR

Gaussian dynamic Bayesian networks structure learning and inference based on the bnlearn package
GNU General Public License v3.0
44 stars 10 forks

Dependencies between time slices #18

Closed 1369959395 closed 1 year ago

1369959395 commented 1 year ago

Sorry to bother you, I have a short question. Is the algorithm based on unstable conditions, and what improvements have been made compared with the traditional DBN? (1. In a traditional DBN, the structure within each time slice is the same. 2. The dependencies between time slices are also the same.) Is only condition 2 being improved?

dkesada commented 1 year ago

Hi! I'm afraid I do not understand your question. I do not know which algorithm based on "unstable conditions" you are referring to. There are currently three structure learning algorithms in dbnR: dmmhc, psoho and natpsoho.

I also do not understand what you mean by traditional DBNs. Do you mean DBNs with only 2 time slices? If so, increasing the Markovian order of the network to allow more than 2 time slices can be useful when your time series have an autoregressive order higher than 1. Markovian order 1 is an assumption that most DBN models tend to make because it simplifies the model greatly, and without a temporal window long-term forecasting would become infeasible. But sometimes one cannot assume that the future is independent of the past given only the present time.

If on the other hand you meant to say that by increasing the Markovian order we are only improving the representation of the dependencies between time slices while the intra-slice arcs remain the same, then you are correct. If you increase the Markovian order of the network with the dmmhc algorithm, all the time slices have a homogeneous internal structure, but the dependencies between time slices become more complex.
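As a rough illustration of what a larger Markovian order means for the data, the sketch below folds a single series into one column per time slice, so that each row holds the present value (t_0) together with its lags. It uses base R only; `fold_series` is a hypothetical helper written for this example and is not a dbnR function, and the `<name>_t_<k>` column naming simply mirrors the variable names used elsewhere in this thread.

```r
# Hypothetical helper (not part of dbnR): fold a univariate series into
# `size` time-slice columns named <name>_t_0, <name>_t_1, ...
fold_series <- function(x, name, size) {
  n <- length(x)
  rows <- size:n  # first usable row is the one that has all lags available
  cols <- lapply(0:(size - 1), function(lag) x[rows - lag])
  names(cols) <- paste0(name, "_t_", 0:(size - 1))
  as.data.frame(cols)
}

x <- c(10, 11, 13, 16, 20)

# size = 3 keeps two lags per row, i.e. Markovian order 2
folded <- fold_series(x, "x", size = 3)
print(folded)
```

With `size = 2` each row would only carry `x_t_0` and `x_t_1`, which corresponds to the Markovian order 1 assumption discussed above.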

1369959395 commented 1 year ago

So it seems that the dmmhc algorithm is used in the example you give?

dkesada commented 1 year ago

Yes, that's right. You can choose one of the other two algorithms by setting the method argument to "psoho" or "natpsoho" in the learn_dbn_struc() function.

1369959395 commented 1 year ago

Thank you. My last questions are about inference and prediction with evidence variables.

  1. In this example, does it mean that, once the network structure has been learned, only the two given evidence variables are used for prediction and the remaining variables are not used?
res_fore <- suppressWarnings(dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 70, prov_ev = c("stator_winding_t_0", "u_q_t_0")))
  2. In this example, is the only difference from the previous one that the "stator_winding_t_0" variable is set to a fixed value at every time instant?
inter_dt <- copy(f_dt_val)
inter_dt[, stator_winding_t_0 := 0.32]
res_fore <- suppressWarnings(dbnR::forecast_ts(inter_dt, fit, obj_vars = c("pm_t_0"), ini = 100, len = 70, prov_ev = c("stator_winding_t_0", "u_q_t_0")))
  3. When performing inference and calculating the prediction accuracy, why are the results of the following two methods different? Shouldn't they be the same?

① res_fore <- suppressWarnings(dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2))

② res_fore <- suppressWarnings(dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 1))
res_fore <- suppressWarnings(dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 101, len = 1))

dkesada commented 1 year ago

Ok, let me answer in order again:

  1. When performing forecasting with the forecast_ts() function, all the values from the variables in older time slices are used to predict the values of the variables in t_0. In addition, if you use the prov_ev argument to provide future evidence, the selected variables will also be used. In your example, all the variables from older time slices in addition to both "stator_winding_t_0" and "u_q_t_0" will be used as evidence.
  2. Yes, that is correct. In the first example, the values used as evidence are the ones that were already in the data. In this case, we set the values of "stator_winding_t_0" to 0.32 as an intervention on the system. This will show the effects that setting that fixed value will cause.
  3. In this scenario the results are different because you are using different data to perform inference in the two cases. Let me explain this in a little more detail:

In your first line you have

res_fore <- dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2)

This means that you start forecasting at the 100th instance of your dataset and forecast 2 instants into the future. To do so, you take the values of the variables in instants 99 and older (depending on the 'size' parameter of your DBN) to forecast the 100th instant. Then, you use the values you forecasted for the 100th instant as evidence to forecast the 101st instant. It is important to note that the values you are using are not the ones in the dataset, but the values inferred by the model. If we used future values from the dataset to perform inference, we would be introducing look-ahead bias, since those values would not be available in a real-world scenario.

Now let's take a look at the other two lines of code:

res_fore <- dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 1)
res_fore <- dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 101, len = 1)

In this case, the first line takes the values in the dataset of the 99th instance and older to forecast the values of the 100th instant, just like the first step of the previous call. However, the second line takes the data of the 100th instance from the dataset, as opposed to the previous case, where you were using predicted values instead of real ones.
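The difference between the two cases can be reproduced with a toy example. The sketch below is base R only and does not use dbnR: `predict_next` is a made-up one-step model standing in for the fitted DBN, and the three observed values are invented for illustration.

```r
# Toy one-step model standing in for a fitted DBN (an assumption for this
# example): the next value is half of the previous one.
predict_next <- function(prev) 0.5 * prev

# Invented observations; index i plays the role of instant i in the dataset.
obs <- c(8, 6, 1)

# Case 1: one call with len = 2, starting at instant 2.
pred_2 <- predict_next(obs[1])            # uses the real value at instant 1
pred_3_recursive <- predict_next(pred_2)  # uses the PREDICTED value at instant 2

# Case 2: two separate calls with len = 1.
pred_3_one_step <- predict_next(obs[2])   # uses the REAL value at instant 2

print(c(recursive = pred_3_recursive, one_step = pred_3_one_step))
# recursive = 2, one_step = 3
```

Both cases agree on the first forecasted instant and only diverge afterwards, because the recursive forecast feeds its own output back in as evidence.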

I hope that clears up some of your doubts.

1369959395 commented 1 year ago

  1. Does this mean that there is no real difference between my questions 1 and 2? When providing evidence variables at time t_0, the only difference is that one of them uses the values in the data table, and the other uses a value given directly by me.

  2. Is it necessary to use the shift_values() function to update the data automatically when using the forecast_ts() function for prediction?

    res_fore <- dbnR::forecast_ts(dbnR::shift_values(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2))

    When shift_values() is used, not only the pm variable is updated, but also the variables that are not being predicted.

res_fore <- dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2)

If shift_values() is not used, only the pm variable is updated. Is this understanding correct?

  3. If I want to discuss the specific impact of a variable on the predicted variable, is it meaningful to observe it by providing fixed evidence in the current time slice? In other words, is there any effective way to discuss the causal relationship between variables on the basis of a given network structure?

dkesada commented 1 year ago

The numbering is getting a little bit confusing, but let's see:

  1. When you perform forecasting on a row of your dataset, you use the values of the variables in previous time slices to predict the values of t_0 in that row and the following ones. If you use the prov_ev argument to provide a variable in t_0 as evidence, then that variable will not be forecasted and the real value in the dataset will be used as evidence instead. The difference between your previous questions 1 and 2 is that in one you use the values of "stator_winding_t_0" that were already in the dataset and in the other you fix that value to 0.32.

  2. The shift_values() function should only be used if you want to forecast unknown data using the t_0 variables in the last row of your dataset as evidence. The only thing this function does is move all the values backwards, so that in a row of your data the values of the variables in t_0 are shifted to the variables in t_1, the values in t_1 to t_2, and so on. Afterwards, the values in t_0 are left as NA so that they can be predicted with the DBN. Doing this

    res_fore <- dbnR::forecast_ts(dbnR::shift_values(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2))

    is equivalent to doing this

    res_fore <- dbnR::forecast_ts(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 101, len = 2)

    In conclusion, do not use the shift_values() function unless you are in a real-world scenario and want to predict values that you do not have in your data, or if you prefer to have NAs in the variables you are predicting for some reason.

  3. You can use the prov_ev argument to perform interventions by fixing the values of some variable to see the effects on the forecasting. This is one of the intended uses of this argument, so that the DBN models can be used as simulators. However, please keep in mind that the relationships shown in a Bayesian network are not causal relationships unless the network structure was built using specific methods for causal learning. The arcs in the graph represent conditional independence relationships, usually based on the results of conditional independence tests or likelihood functions. With the prov_ev argument you can see the effects that a variable has on the final result of the prediction, or on some other variable, and you can evaluate its importance. But calling the relationship between two variables "causal" based on the results of inference performed with a non-causal model is very debatable.
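Going back to point 2, the shifting step can be pictured with a small base R sketch. `shift_sketch` is a hypothetical helper written for this example, not dbnR's implementation, and it assumes a single variable folded into at least two time slices with `<name>_t_<k>` column names.

```r
# Hypothetical helper (not dbnR's shift_values): move every t_k column into
# t_(k+1), oldest slice first, and then blank out t_0.
shift_sketch <- function(df, var, size) {
  stopifnot(size >= 2)
  for (k in (size - 1):1) {
    df[[paste0(var, "_t_", k)]] <- df[[paste0(var, "_t_", k - 1)]]
  }
  df[[paste0(var, "_t_0")]] <- NA_real_  # left empty so it can be predicted
  df
}

# Invented last row of a folded dataset with three time slices
last_row <- data.frame(x_t_0 = 20, x_t_1 = 16, x_t_2 = 13)
shifted <- shift_sketch(last_row, "x", size = 3)
print(shifted)
```

After the shift, the former present value sits in `x_t_1` and `x_t_0` is NA, which is the state a model would need in order to predict an instant that is not yet in the data.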

1369959395 commented 1 year ago

OK, thank you. I think I understand what you mean.

res_fore <- dbnR::forecast_ts(dbnR::shift_values(f_dt_val, fit, obj_vars = c("pm_t_0"), ini = 100, len = 2))

In this example, when the 101st instant is predicted, the predicted values of the 100th instant are used as evidence. Does this mean that, although only the variables given in "obj_vars" are requested, the other variables are still predicted and used as evidence for the next instant, but are not shown in the output?

dkesada commented 1 year ago

Yes, that is correct. All the variables in t_0 are predicted to be able to perform forecasting, and only the ones given in obj_vars are returned.
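A toy version of this behaviour, in base R and without dbnR, might look as follows. The two-variable `step` model, the `forecast_sketch` function and its `fixed` argument are all assumptions made up for this example; `fixed` only loosely mimics the intervention use of prov_ev discussed earlier, and in this toy model its effect shows up one instant later.

```r
# Toy stand-in for a fitted DBN with two variables, a and b (not dbnR code).
# Each new value depends on the variables in the previous time slice.
step <- function(prev) {
  c(a = 0.5 * prev[["a"]] + 0.1 * prev[["b"]],
    b = 0.9 * prev[["b"]])
}

forecast_sketch <- function(start, len, obj_vars, fixed = NULL) {
  state <- start
  out <- matrix(NA_real_, nrow = len, ncol = length(obj_vars),
                dimnames = list(NULL, obj_vars))
  for (i in seq_len(len)) {
    state <- step(state)  # every variable is predicted at each instant
    if (!is.null(fixed)) state[names(fixed)] <- fixed  # fixed evidence overrides the prediction
    out[i, ] <- state[obj_vars]  # only the requested variables are returned
  }
  out
}

start <- c(a = 1, b = 10)
res <- forecast_sketch(start, len = 2, obj_vars = "a")
res_fixed <- forecast_sketch(start, len = 2, obj_vars = "a", fixed = c(b = 0))
print(res)
print(res_fixed)
```

Even though only `a` is returned, `b` has to be predicted internally at every instant, because the next value of `a` depends on it. Fixing `b` changes the second forecast of `a` while leaving the first one untouched.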

1369959395 commented 1 year ago

This is very helpful to me. Thank you for your patience.