Open schneiderpy opened 11 months ago
Dear Andreas, thank you for your question! Yes, you can include a control group among the covariates. To do that, you can use the option xreg within the CausalArima function. The xreg option allows you to incorporate a vector, matrix or data.frame of regressors that can be helpful in explaining the outcome y in the absence of the intervention. So, it can be used to include both covariates and a control group, if available. Please feel free to reach out if you have any further inquiries.
Thank you Fiammetta for your quick reply. I will give it a try ...
Dear Fiammetta, a quick question: I suppose that I have to change from long to wide format to include the control group values (as a time series) and then construct an xreg like this ... xreg = cbind(valuescontrolgroup + grouptreatment + grouptrend + othercovariate)
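As a rough illustration of the reshaping step asked about here, the sketch below turns hypothetical long-format data into a wide layout with base R and builds an xreg matrix. All data and names (long, value, group, temperature) are made up for illustration. One thing to keep in mind: in R, cbind(a + b) adds the vectors element-wise into a single column, so separate regressors need commas. The CausalArima() call is left as a comment, and its argument names (y, dates, int.date, xreg) are an assumption to be checked against the package documentation.

```r
# Hypothetical long-format data: one row per date and group
dates <- seq(as.Date("2020-01-01"), by = "month", length.out = 24)
set.seed(1)
long <- data.frame(
  date  = rep(dates, times = 2),
  group = rep(c("treated", "control"), each = length(dates)),
  value = c(rnorm(24, mean = 100, sd = 5), rnorm(24, mean = 90, sd = 5))
)

# Long -> wide: one row per date, one column per group
wide <- reshape(long, idvar = "date", timevar = "group", direction = "wide")
wide <- wide[order(wide$date), ]

# Outcome of the treated unit and a made-up extra covariate
y <- wide$value.treated
temperature <- rnorm(length(y), mean = 15, sd = 3)

# Separate columns need commas ...
xreg <- cbind(control = wide$value.control, temperature = temperature)

# ... whereas "+" inside cbind() sums everything into ONE column
summed <- cbind(wide$value.control + temperature)

print(dim(xreg))
print(dim(summed))

# Assumed call, not run here; check argument names in the package docs:
# ce <- CausalArima(y = ts(y), dates = dates, int.date = dates[19], xreg = xreg)
```

The rows of xreg have to line up with y, which is why the wide table is sorted by date before the columns are extracted.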
Dear Fiammetta, I am trying to interpret the results of the main function CausalArima(), and in particular the significance of the p-values. How can I obtain these? For example, I have the following results:
    summary_model
    $impact_norm
    $impact_norm$average
       estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -44.29584 7.963023 1.328225e-08          2.656451e-08             1

    $impact_norm$sum
       estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -1417.467 254.8167 1.328225e-08          2.656451e-08             1

    $impact_norm$point_effect
       estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -66.95355 21.28963  0.000830745            0.00166149     0.9991693

    $impact_boot
    $impact_boot$average
                     estimates         inf         sup         sd
    observed         66.593750          NA          NA         NA
    forecasted      110.889589  97.5934997 125.9130978 7.44123098
    absolute_effect -44.295839 -59.3193478 -30.9997497 7.44123098
    relative_effect  -0.399459  -0.5349406  -0.2795551 0.06710487

    $impact_boot$effect_cum
                       estimates           inf          sup           sd
    observed         2131.000000            NA           NA           NA
    forecasted       3548.466840  3122.9919905 4029.2191295 238.11939133
    absolute_effect -1417.466840 -1898.2191295 -991.9919905 238.11939133
    relative_effect    -0.399459    -0.5349406   -0.2795551   0.06710487

    $impact_boot$p_values
     alpha    p
      0.05 0.00

The result should be somehow significant ...
Dear Andreas, it seems that the results are significant. For example, if you take the results under the Normality assumption, the estimated cumulative effect is -1417, the estimated standard deviation is about 255 and the bidirectional p-value is about 2.7e-08 (effectively 0), meaning that you reject the null hypothesis that the cumulative effect is 0. If summary_model is the output of CausalArima(), you can also use summary(summary_model), and you can plot the causal effect with plot(summary_model, type = "impact") or the comparison between the observed and forecasted series with plot(summary_model, type = "forecast").
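For readers who want to check the numbers by hand, the sketch below recomputes the three p-values for the cumulative effect from the estimate and standard deviation reported earlier in the thread. It assumes a plain z-test (estimate divided by sd, compared against a standard normal), which reproduces the printed values but is not confirmed here as the package's internal computation.

```r
# Values copied from $impact_norm$sum in the first output of the thread
estimate <- -1417.467
sd_hat   <- 254.8167

# Assumed test statistic: standardized effect under H0 "effect = 0"
z <- estimate / sd_hat

# Left tail, right tail, and two-sided p-values from the standard normal
p_left          <- pnorm(z)
p_right         <- pnorm(z, lower.tail = FALSE)
p_bidirectional <- 2 * min(p_left, p_right)

print(c(left = p_left, bidirectional = p_bidirectional, right = p_right))
```

The left and bidirectional values come out on the order of 1e-08, which is why the negative cumulative effect counts as significant at any usual level.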
Dear Fiammetta, thank you for your prompt reply. What does the $impact_boot$p_values indicate?
The p_values are defined as in CausalImpact (Brodersen), that is:
min(mean(y.samples.post.sum >= y.post.sum), mean(y.samples.post.sum<= y.post.sum))
which is very similar to the formula they have used:
p <- min(sum(c(y.samples.post.sum, y.post.sum) >= y.post.sum), sum(c(y.samples.post.sum, y.post.sum) <= y.post.sum)) / (length(y.samples.post.sum) + 1)
The difference is that they append the observed value to the samples, which adds one to each count and to the denominator, so that the p-value can never be exactly 0.
It should be mentioned that the values from both formulas should probably be called Probability of Direction rather than p-values. We have kept the term p-value to keep the names similar to CausalImpact, because they are basically the Bayesian equivalent of p-values, and because the term probability of direction is less well known.
In practice, this measure refers to the one-sided tail-area probability of the overall impact over the entire post-intervention period, or in other words of the total or average effect.
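To make the difference between the two formulas concrete, here is a minimal sketch that applies both to the same set of draws. The simulated samples and the observed value are hypothetical; only the two formulas themselves are taken from the thread.

```r
set.seed(42)

# Hypothetical simulated draws of the cumulative post-intervention outcome
y.samples.post.sum <- rnorm(1000, mean = 3500, sd = 240)
# Hypothetical observed cumulative post-intervention outcome
y.post.sum <- 3100

n <- length(y.samples.post.sum)

# First formula: smaller of the two tail proportions; can be exactly 0
p_plain <- min(mean(y.samples.post.sum >= y.post.sum),
               mean(y.samples.post.sum <= y.post.sum))

# Second formula: the observed value is appended to the samples,
# so each count and the denominator grow by one
p_plus_one <- min(sum(c(y.samples.post.sum, y.post.sum) >= y.post.sum),
                  sum(c(y.samples.post.sum, y.post.sum) <= y.post.sum)) / (n + 1)

print(c(plain = p_plain, plus_one = p_plus_one))
```

With n draws the second formula can never fall below 1 / (n + 1), whereas the first returns exactly 0 whenever the observed value lies outside the range of the draws, as in the p = 0.00 reported above.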
Thank you @palmierieugenio for the clarification.
Hello Eugenio, unfortunately I am still confused by your p-value(s). For example, the output below indicates a significant bidirectional p-value for the cumulative sum (actually, all three p-values are significant). However, the "overall" p-value is not significant. Which one should I use?
    $impact_norm
    $impact_norm$average
      estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -43.4874 6.467814  8.86062e-12          1.772116e-11             1

    $impact_norm$sum
       estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -521.8488 77.61376  8.86062e-12          1.772116e-11             1

    $impact_norm$point_effect
       estimate       sd p_value_left p_value_bidirectional p_value_right
    1 -70.51046 22.40516 0.0008245977           0.001649195     0.9991754

    $impact_boot
    $impact_boot$average
                      estimates         inf         sup         sd
    observed        111.8333333          NA          NA         NA
    forecasted      155.3207303 105.0784663 126.4281383 5.61229305
    absolute_effect -43.4873970 -14.5948050   6.7548671 5.61229305
    relative_effect  -0.2799845  -0.0939656   0.0434898 0.03613357

    $impact_boot$effect_cum
                       estimates          inf          sup          sd
    observed        1342.0000000           NA           NA          NA
    forecasted      1863.8487640 1260.9415952 1517.1376601 67.34751658
    absolute_effect -521.8487640 -175.1376601   81.0584048 67.34751658
    relative_effect   -0.2799845   -0.0939656    0.0434898  0.03613357

    $impact_boot$p_values
     alpha     p
     0.050 0.243
Dear Fiammetta, I came across your package and I would like to know if there is a way to include a control group in the C-ARIMA approach.
Thank you in advance Andreas