CausalEstimator reporting a 90% instead of 95% confidence interval for bootstrapping?

YichenTang97 commented 8 months ago

Problem description

The current implementation of the _estimate_confidence_intervals_with_bootstrap method in the CausalEstimator class might be reporting a 90% confidence interval (CI) instead of a 95% CI under the default confidence_level setting of 0.95.

The current implementation for obtaining the CI seems to follow the pivotal interval method (see section 8.3 of [1], and section 6 of the reading material referred to in the code comment [2]). Let $x_1, x_2, \ldots, x_n$ be the observed sample of size $n$ drawn from a distribution $F$, and let $\bar{x}$ be the observed sample mean. Denote by $x_1^*, x_2^*, \ldots, x_n^*$ a resample of the data of the same size $n$, and by $\bar{x}^*$ the mean of this resample. One can estimate the CI at significance level $\alpha$ (usually 0.05) as follows:

$$CI = (\bar{x} - \delta^*_{1-\alpha/2}, \bar{x} - \delta^*_{\alpha/2}),$$

where $\delta^* = \bar{x}^* - \bar{x}$ is the difference between a resample mean and the observed sample mean, whose distribution is approximated by computing it over many bootstrap resamples, and $\delta^*_i$ denotes the $100 \cdot i$th percentile of that distribution.

For a significance level $\alpha=0.05$ (i.e. a 95% CI), we should use the 2.5th and 97.5th percentiles, so that $CI = (\bar{x} - \delta^*_{0.975}, \bar{x} - \delta^*_{0.025})$. However, in the current implementation, the _estimate_confidence_intervals_with_bootstrap method returns $CI = (\bar{x} - \delta^*_{0.95}, \bar{x} - \delta^*_{0.05})$ for confidence_level=0.95. This cuts 5% from each tail rather than 2.5%, so it in fact reports the 90% CI.
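To make the difference concrete, here is a minimal NumPy sketch of the two percentile choices. It is not DoWhy's code: the function names (bootstrap_deltas, pivotal_ci, one_sided_tail_ci), the use of the sample mean as the estimator, and the simulated data are all assumptions made for illustration. one_sided_tail_ci only mimics the behaviour described above.

```python
import numpy as np


def bootstrap_deltas(sample, num_simulations=2000, seed=0):
    """Return delta* = mean(resample) - mean(sample) for each bootstrap resample."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Each row holds the indices of one resample of size n, drawn with replacement.
    idx = rng.integers(0, n, size=(num_simulations, n))
    return sample[idx].mean(axis=1) - sample.mean()


def pivotal_ci(sample, confidence_level=0.95, **kwargs):
    """Pivotal interval using the alpha/2 and 1 - alpha/2 percentiles of delta*."""
    sample = np.asarray(sample, dtype=float)
    deltas = bootstrap_deltas(sample, **kwargs)
    alpha = 1.0 - confidence_level
    lower_delta, upper_delta = np.quantile(deltas, [alpha / 2, 1 - alpha / 2])
    x_bar = sample.mean()
    return x_bar - upper_delta, x_bar - lower_delta


def one_sided_tail_ci(sample, confidence_level=0.95, **kwargs):
    """Hypothetical re-creation of the reported behaviour: cuts alpha from each tail."""
    sample = np.asarray(sample, dtype=float)
    deltas = bootstrap_deltas(sample, **kwargs)
    lower_delta, upper_delta = np.quantile(
        deltas, [1 - confidence_level, confidence_level]
    )
    x_bar = sample.mean()
    return x_bar - upper_delta, x_bar - lower_delta


# Simulated data, for illustration only.
data = np.random.default_rng(42).normal(loc=10.0, scale=2.0, size=200)
print("pivotal, confidence_level=0.95:      ", pivotal_ci(data, 0.95))
print("reported behaviour, 0.95 requested:  ", one_sided_tail_ci(data, 0.95))
print("pivotal, confidence_level=0.90:      ", pivotal_ci(data, 0.90))
```

With a fixed seed, the second and third intervals should coincide up to floating-point error, and both should be narrower than the first.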

Could someone look into this and make changes if necessary? It would also be helpful to have an option for choosing the method used to compute the CI (e.g. pivotal, percentile, normal).
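As a rough sketch of what such an option could look like, the helper below takes a point estimate and an array of bootstrap estimates and dispatches on a method argument. The function name bootstrap_ci, its signature, and the method strings are hypothetical and do not correspond to anything in DoWhy's API.

```python
from statistics import NormalDist

import numpy as np


def bootstrap_ci(estimate, bootstrap_estimates, confidence_level=0.95, method="pivotal"):
    """Two-sided bootstrap CI from precomputed bootstrap estimates (hypothetical helper)."""
    boots = np.asarray(bootstrap_estimates, dtype=float)
    alpha = 1.0 - confidence_level
    tail_probs = [alpha / 2, 1 - alpha / 2]

    if method == "pivotal":
        # Percentiles of delta* = estimate* - estimate, reflected around the estimate.
        lower_delta, upper_delta = np.quantile(boots - estimate, tail_probs)
        return estimate - upper_delta, estimate - lower_delta
    if method == "percentile":
        # Percentiles of the bootstrap estimates themselves.
        lower, upper = np.quantile(boots, tail_probs)
        return lower, upper
    if method == "normal":
        # Normal approximation with the bootstrap standard error.
        z = NormalDist().inv_cdf(1 - alpha / 2)
        std_err = boots.std(ddof=1)
        return estimate - z * std_err, estimate + z * std_err
    raise ValueError(f"Unknown method: {method!r}")
```

All three branches split alpha evenly between the two tails, so confidence_level=0.95 always corresponds to the 2.5th and 97.5th percentiles (or the matching normal quantile).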

Version information

DoWhy v0.10.1

References

[1] L. Wasserman, All of Statistics: A Concise Course in Statistical Inference, vol. 26. Springer, 2004.

[2] Reading 24: Bootstrap Confidence Intervals (https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/readings/MIT18_05S14_Reading24.pdf)

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 14 days with no activity.

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.

amit-sharma commented 7 months ago

Thanks for raising this, @YichenTang97. Will take a look.

github-actions[bot] commented 7 months ago

This issue is stale because it has been open for 14 days with no activity.

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.

py-why / dowhy

CausalEstimator reporting a 90% instead of 95% confidence interval for bootstrapping? #1063