google / CausalImpact

An R package for causal inference in time series
Apache License 2.0

Incompatible confidence intervals between the series object and the summary output #75

Open abduazizR opened 6 months ago

abduazizR commented 6 months ago

Hi,

I am seeing behavior that I cannot explain. Below is a reproducible example.

library(tidyverse)
library(CausalImpact)
#> Loading required package: bsts
#> Loading required package: BoomSpikeSlab
#> Loading required package: Boom
#> 
#> Attaching package: 'Boom'
#> The following object is masked from 'package:stats':
#> 
#>     rWishart
#> 
#> Attaching package: 'BoomSpikeSlab'
#> The following object is masked from 'package:stats':
#> 
#>     knots
#> Loading required package: zoo
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric
#> Loading required package: xts
#> 
#> ######################### Warning from 'xts' package ##########################
#> #                                                                             #
#> # The dplyr lag() function breaks how base R's lag() function is supposed to  #
#> # work, which breaks lag(my_xts). Calls to lag(my_xts) that you type or       #
#> # source() into this session won't work correctly.                            #
#> #                                                                             #
#> # Use stats::lag() to make sure you're not using dplyr::lag(), or you can add #
#> # conflictRules('dplyr', exclude = 'lag') to your .Rprofile to stop           #
#> # dplyr from breaking base R's lag() function.                                #
#> #                                                                             #
#> # Code in packages is not affected. It's protected by R's namespace mechanism #
#> # Set `options(xts.warn_dplyr_breaks_lag = FALSE)` to suppress this warning.  #
#> #                                                                             #
#> ###############################################################################
#> 
#> Attaching package: 'xts'
#> The following objects are masked from 'package:dplyr':
#> 
#>     first, last
#> 
#> Attaching package: 'bsts'
#> The following object is masked from 'package:BoomSpikeSlab':
#> 
#>     SuggestBurn
set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10
data <- cbind(y, x1)

pre.period <- c(1, 70)
post.period <- c(71, 100)

impact <- CausalImpact(data, pre.period, post.period)

# Output 1: average the pointwise effect and its interval bounds over the post-period (rows 71-100)
impact$series |> 
  data.frame() |> 
  slice(71:100) |> 
  rownames_to_column(var = "time") |> 
  tibble() |>  
  summarise(
    across(c(response, point.pred, contains("point.effect")), mean),
  )
#> # A tibble: 1 × 5
#>   response point.pred point.effect point.effect.lower point.effect.upper
#>      <dbl>      <dbl>        <dbl>              <dbl>              <dbl>
#> 1     117.       107.         10.5               7.80               13.3

# Output 2: summary table computed by CausalImpact
impact$summary
#>               Actual      Pred Pred.lower Pred.upper    Pred.sd AbsEffect
#> Average     117.0485  106.5372   105.8365   107.2868  0.3724158  10.51134
#> Cumulative 3511.4555 3196.1154  3175.0955  3218.6046 11.1724731 315.34013
#>            AbsEffect.lower AbsEffect.upper AbsEffect.sd  RelEffect
#> Average           9.761698          11.212    0.3724158 0.09873264
#> Cumulative      292.850949         336.360   11.1724731 0.09873264
#>            RelEffect.lower RelEffect.upper RelEffect.sd alpha           p
#> Average         0.09098693        0.105937  0.003841021  0.05 0.001003009
#> Cumulative      0.09098693        0.105937  0.003841021  0.05 0.001003009

Created on 2024-05-27 with reprex v2.1.0

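The average effect estimate is the same in both outputs (about 10.5), but the intervals differ: averaging the pointwise bounds in impact$series gives roughly [7.80, 13.3], while impact$summary reports [9.76, 11.21]. I tried different datasets and got the same result.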

Thanks

sdamerdji commented 1 month ago

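This is not strange behavior. The mean of the lower bounds of the 30 pointwise effects in the post-period is not the same as the lower bound of the mean of those 30 effects. Averaging over time points cancels out part of the point-to-point uncertainty, so the interval for the average effect is narrower than the average of the pointwise intervals.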
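
To illustrate the distinction, here is a minimal sketch in base R. It does not reproduce CausalImpact's internals; `effect_draws` is a hypothetical matrix of simulated posterior draws (one row per draw, one column per post-period time point), and the draws are assumed independent across time, which is a simplification.

```r
set.seed(42)

n_draws <- 1000  # hypothetical number of posterior draws
n_times <- 30    # post-period length, matching rows 71-100 in the example

# Hypothetical draws of the pointwise effect: centered at 10, sd 1.5,
# independent across time points (a simplifying assumption).
effect_draws <- matrix(
  rnorm(n_draws * n_times, mean = 10, sd = 1.5),
  nrow = n_draws, ncol = n_times
)

# Approach 1: take the interval at each time point, then average the bounds.
# This mirrors averaging point.effect.lower / point.effect.upper over rows.
pointwise_lower <- apply(effect_draws, 2, quantile, probs = 0.025)
pointwise_upper <- apply(effect_draws, 2, quantile, probs = 0.975)
mean_of_bounds <- c(lower = mean(pointwise_lower),
                    upper = mean(pointwise_upper))

# Approach 2: average the effect within each draw, then take the interval
# of those averages. This is the interval of the mean effect.
avg_effect_per_draw <- rowMeans(effect_draws)
bounds_of_mean <- unname(quantile(avg_effect_per_draw, probs = c(0.025, 0.975)))
names(bounds_of_mean) <- c("lower", "upper")

print(mean_of_bounds)
print(bounds_of_mean)
```

Both approaches are centered on the same value, but the second interval is much narrower, which is the same pattern as the gap between the averaged `impact$series` bounds and `impact$summary`.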