google / CausalImpact

An R package for causal inference in time series
Apache License 2.0

Several questions about the codes #45

Open MengDH opened 3 years ago

MengDH commented 3 years ago

Hi CausalImpact team,

I'm trying to use the CausalImpact package for one of our company's projects. After reviewing the source code in this repository, I have a few questions:

  1. In the file impact_model.R, the expected.model.size argument passed to bsts is fixed at 3 (via the constant kStaticRegressionExpectedModelSize). Is there a reason this value is fixed at 3? Also, would it be possible to make it adjustable in the future through model.args?

  2. I also noticed that the control series for both pre.period and post.period are passed to a single bsts call. This might mean that post.period information is also used when the posterior distribution of the coefficients is estimated from the pre.period data.
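To make point 2 concrete, this is a minimal sketch of the pattern as I understand it: the post-period response is masked with NA, while the controls for the whole period go into one bsts call. The data are simulated, and names such as pre_idx and post_idx are my own placeholders, not identifiers from the package.

```r
library(bsts)

set.seed(1)
n <- 60
pre_idx <- 1:40    # hypothetical pre-period rows
post_idx <- 41:60  # hypothetical post-period rows

# Simulated data, for illustration only.
control_1 <- rnorm(n, mean = 10)
control_2 <- rnorm(n)
target <- 2 * control_1 + rnorm(n, sd = 0.3)
dat <- data.frame(y = target, control_1, control_2)

# Mask the post-period response but keep the post-period controls.
dat$y[post_idx] <- NA

# Local level state; prior defaults are derived from the observed pre-period response.
ss <- AddLocalLevel(list(), dat$y[pre_idx])

# A single bsts call that sees the controls for both periods.
fit <- bsts(y ~ control_1 + control_2, state.specification = ss, data = dat,
            niter = 300, ping = 0, expected.model.size = 3)

# One row of coefficient draws per MCMC iteration.
dim(fit$coefficients)
```

In this sketch expected.model.size is written out explicitly, which is the value I would like to be able to change.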

For example, suppose my dataset test looks like this:

date_period, target, control_1, control_2
1, 100, 3, 0
2, 90, 2, 0
3, 80, 2, 0
4, 70, 3, 1
5, 60, 3, 1

Code to generate this test data:

target <- c(100, 90, 80, 70, 60)
control_1 <- c(3, 2, 2, 3, 3)
control_2 <- c(0, 0, 0, 1, 1)

test <- data.frame(target, control_1, control_2)

with pre.period = c(1, 2) and post.period = c(3, 4).

In this case, if you run the command:

library(CausalImpact)
library(data.table)  # provides as.data.table()
impact <- CausalImpact(test, pre.period, post.period, model.args = list(standardize.data = FALSE, niter = 100), alpha = 0.9)
coef <- as.data.table(impact$model$bsts.model$coefficients)

you will find that control_2 also has non-zero coefficients in each iteration.
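One way to quantify this is the share of MCMC draws in which each coefficient is non-zero. The sketch below uses simulated data rather than the tiny example above, since that example is too short to say much. The variable names are mine; only the impact$model$bsts.model$coefficients path is taken from the call above.

```r
library(CausalImpact)

set.seed(1)
n <- 100

# Simulated data, for illustration only: control_1 drives the target,
# control_2 is pure noise.
control_1 <- 100 + cumsum(rnorm(n))
control_2 <- rnorm(n)
target <- 1.2 * control_1 + rnorm(n)
sim <- data.frame(target, control_1, control_2)

impact <- CausalImpact(sim, pre.period = c(1, 70), post.period = c(71, 100),
                       model.args = list(niter = 500))

# Matrix of coefficient draws: one row per MCMC iteration, one column per term.
draws <- impact$model$bsts.model$coefficients

# Fraction of draws in which each coefficient is non-zero.
inclusion_freq <- colMeans(draws != 0)
print(inclusion_freq)
```

A control that the spike-and-slab prior rarely selects should show a frequency well below 1, so this is a quick check of how often a column like control_2 actually enters the model.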

This is of course an extreme case. My question is: have you compared how much the results differ between the current method and the traditional approach of calling predict() on a bsts model built using only pre.period data?
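To be clear about what I mean by the traditional approach, here is a minimal sketch: fit bsts on the pre-period rows only, then forecast the post-period from the post-period controls. The data are simulated and all variable names are my own; I am not claiming this matches what the package does internally.

```r
library(bsts)

set.seed(1)
n <- 100
n_pre <- 70

# Simulated data, for illustration only.
control_1 <- 100 + cumsum(rnorm(n))
control_2 <- rnorm(n)
target <- 1.2 * control_1 + rnorm(n)
dat <- data.frame(target, control_1, control_2)

pre <- dat[1:n_pre, ]
post <- dat[(n_pre + 1):n, ]

# The model sees pre-period rows only, so post-period controls cannot
# influence the coefficient draws.
ss <- AddLocalLevel(list(), pre$target)
fit <- bsts(target ~ control_1 + control_2, state.specification = ss,
            data = pre, niter = 500, ping = 0)

# Forecast the counterfactual from the post-period controls.
pred <- predict(fit, newdata = post[, c("control_1", "control_2")],
                burn = 100, quantiles = c(0.025, 0.975))

# Pointwise effect: observed minus predicted counterfactual.
point_effect <- post$target - pred$mean
```

Comparing point_effect and pred$interval with the corresponding output of CausalImpact on the same data would show how much the two approaches differ.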

Thank you in advance!