Closed: lukasvermeer closed this issue 2 years ago
Hi, Lukas! The statement is indeed only true if you compute a CI 'the normal way' as you describe it - like a CI around a mean difference. It is possible, in theory, to construct a different 95% CI that does not correspond to a statistical test. To keep the text readable, I don't go into detail on hypothetical edge cases. I feel that's more defensible in a text that provides a formal treatment of these issues, while my textbook has a more applied focus. In general, any CI that you will get out of statistical software packages will show this relationship. I say "in general" because there are sometimes dozens of ways to compute a CI, some with slightly better coverage, and then the relationship does not formally hold (but the discrepancy shows up at a decimal place that has little relevance in practice).
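For what it's worth, here is a minimal sketch of that 'normal way' correspondence for a mean difference, assuming Python with NumPy/SciPy (the data and seed are just illustrative, not from the thread): the two-sided Welch t-test p-value drops below .05 exactly when the matching 95% CI for the mean difference excludes zero, because both are built from the same t statistic, degrees of freedom, and critical value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=30)
y = rng.normal(0.4, 1.0, size=30)

# Welch two-sample t-test p-value from SciPy
p_value = stats.ttest_ind(x, y, equal_var=False).pvalue

# Matching 95% CI for the mean difference, built by hand from the
# same standard error and Welch-Satterthwaite degrees of freedom
diff = x.mean() - y.mean()
vx, vy = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
se = np.sqrt(vx + vy)
df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
crit = stats.t.ppf(0.975, df)
ci_lo, ci_hi = diff - crit * se, diff + crit * se

print(f"p = {p_value:.4f}, 95% CI = ({ci_lo:.3f}, {ci_hi:.3f})")
# The CI excludes 0 if and only if p < .05: |diff| > crit * se is the
# same inequality as |t| > crit, which is the same event as p < .05.
```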
Clear. Fair. Thank you. Good to hear I was not completely wrong all those years. 😅
I think another, more technical consideration is that CIs may be approximate (for example, constructed using Fieller's theorem) rather than exact. Using an approximate CI alongside an exact p-value would affect how precisely the two line up.
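A quick sketch of that mismatch, assuming Python/SciPy and substituting a simpler approximate interval for Fieller's theorem (the Wald interval for a binomial proportion stands in for "any approximate CI"; the counts are just illustrative): near the boundary, the approximate 95% CI and the exact binomial test can disagree about whether a null value is "significant".

```python
import numpy as np
from scipy import stats

n, x = 20, 13
phat = x / n
se = np.sqrt(phat * (1 - phat) / n)
wald_lo, wald_hi = phat - 1.96 * se, phat + 1.96 * se  # approximate 95% CI

# Scan null values and flag any place where "outside the approximate CI"
# and "exact p < .05" give different answers.
for p0 in np.arange(0.30, 0.95, 0.01):
    p_exact = stats.binomtest(x, n, p0).pvalue
    excluded = not (wald_lo <= p0 <= wald_hi)
    if excluded != (p_exact < 0.05):
        print(f"p0 = {p0:.2f}: Wald CI excludes it = {excluded}, exact p = {p_exact:.3f}")
```

The disagreements cluster right at the edges of the interval, which matches the point above: the relationship still holds for all practical purposes, but not to the last decimal place.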
On this page there is this strong statement of equivalence.
I always thought that this approach was usually reasonable in practice, but not true by definition. As in: I thought it was theoretically possible to construct valid confidence intervals which could not be used in the way described here.
A bad example of this might be:
Obviously one would never do that, but I think that would be a valid 95% confidence interval, because it "covers the true parameter 95% of the time" (which I think is the only requirement for a CI to be valid). Yet it cannot be used in the way described in the quote.
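One hypothetical construction in this spirit (my own sketch, not necessarily the example referred to above, assuming Python/NumPy): a "confidence procedure" that ignores the data entirely, reporting the whole real line with probability 0.95 and an empty interval otherwise. Its coverage is exactly 95% by construction, yet whether it contains a given null value says nothing about the data, so it cannot be read as an ordinary significance test of that value.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 0.4
n_sims = 100_000
covered = 0

for _ in range(n_sims):
    # the data are drawn but never used by the "interval"
    _sample = rng.normal(true_mean, 1.0, size=30)
    if rng.random() < 0.95:
        lo, hi = -np.inf, np.inf   # covers every possible value
    else:
        lo, hi = np.nan, np.nan    # empty interval, covers nothing
    covered += (lo <= true_mean <= hi)

print(f"coverage ≈ {covered / n_sims:.3f}")  # ≈ 0.95 by construction
```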
I would love to be corrected here. Are there any references that explain why the relationship between CI and statistical significance is indeed direct?