google / CausalImpact

An R package for causal inference in time series
Apache License 2.0
1.71k stars 254 forks source link

Posterior predictive confidence intervals differ to calculation by `predict.bsts` #30

Open tcuongd opened 5 years ago

tcuongd commented 5 years ago

Hi, from looking at the code I can see that CausalImpact generates the posterior predictive samples by getting the state value draws and sampling noise with variance equal to the sigma.obs draws from the bsts model object.

When I try to replicate the inference step using predict.bsts, I get different results. In particular, the credible intervals produced by generating the posterior predictive with bsts.predict are larger than CausalImpact suggests. The bounds of the relative difference credible interval can differ by 2-3%. Is there inherently a difference between how CausalImpact prediction and predict.bsts works? I've attached example code below.

library(magrittr)
library(CausalImpact)

# Dummy data
set.seed(1)
x1 <- 8000 + arima.sim(model = list(ar = 0.99), n = 100)
y <- 1.2 * x1 + rnorm(100, 0, 1)
y[71:100] <- y[71:100] + 10
data <- cbind(y, x1)

pre.period <- c(1, 70)
post.period <- c(71, 100)

# Run CausalImpact
impact <- CausalImpact(data, pre.period, post.period, model.args = list(niter = 10000, standardize.data = F))

# Predictions from bsts
bsts_model <- impact$model$bsts.model
bsts_predict <-
  predict.bsts(bsts_model,
               newdata = data[post.period[1]:post.period[2], "x1"])

# Actuals as a matrix
observed_post <-
  data[post.period[1]:post.period[2], "y"] %>% 
  rep(nrow(bsts_predict$distribution)) %>% 
  matrix(nrow = nrow(bsts_predict$distribution), byrow = T)

# Calculate differences
ppd_diff <- observed_post - bsts_predict$distribution
# Cumulative to last day of post period
diff_cum <- rowSums(ppd_diff)
reldiff_cum <- rowSums(ppd_diff)/rowSums(observed_post)

# Calculated stats
cat("Absolute difference:\n")
c(quantile(diff_cum, 0.025), mn = mean(diff_cum), quantile(diff_cum, 0.975)) %>% 
  print()
# Returns 75.6, 204.3, 342.3

cat("Relative difference:\n")
c(quantile(reldiff_cum, 0.025), mn = mean(reldiff_cum), quantile(reldiff_cum, 0.975)) %>% 
  print()
# Returns 0.026%, 0.071%, 0.12%

# Compare to CausalImpact CIs
summary(impact)
# Returns 123.2, 203.7, 288.0
#         0.043%, 0.071%, 0.1%
nredell commented 5 years ago

I ran across this issue as well. My guess is that the difference lies in the fact that predict.bsts is predicting from period 101 onward. If you check bsts_model$timestamp.info from the code above and compare it to the what the predict.bsts help docs say it makes sense:

The time stamps give the time points as which each prediction is desired. They must be interpretable as integer (0 or larger) time steps following the last time stamp in object. If NULL, then the requested predictions are interpreted as being at 1, 2, 3, ... steps following the training data.

Looking through https://github.com/cran/bsts/blob/d20c0f3e3dc8a02bf4485234673004892d4a92a1/R/format.prediction.data.R might shed some light.