## p-values and the likelihood ratio test
A common test statistic is the likelihood ratio:

$$\Lambda(x) = \frac{L(\theta_0 \mid x)}{L(\theta_1 \mid x)},$$

which, by the Neyman-Pearson Lemma, gives the most powerful test for simple hypotheses (we reject $H_0$ when $\Lambda$ is small).
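A minimal numerical sketch of this (a toy example of my own, testing two simple hypotheses about a normal mean with known variance):

```python
import numpy as np
from scipy import stats

# Toy example: simple hypotheses H0: mu = 0 vs H1: mu = 1 for N(mu, 1) data.
# By the Neyman-Pearson Lemma, rejecting H0 for small Lambda = L0/L1 is the
# most powerful test at its level.
rng = np.random.default_rng(42)
x = rng.normal(loc=1.0, scale=1.0, size=30)  # data generated under H1

log_L0 = stats.norm.logpdf(x, loc=0.0, scale=1.0).sum()
log_L1 = stats.norm.logpdf(x, loc=1.0, scale=1.0).sum()
log_lambda = log_L0 - log_L1  # log likelihood ratio

print(f"log Lambda = {log_lambda:.2f}")  # large negative => evidence against H0
```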
There is some controversy surrounding the use of p-values; this article gives a nice explanation: https://www.vox.com/science-and-health/2017/7/31/16021654/p-values-statistical-significance-redefine-0005. An excerpt:
> In a 2013 PNAS paper, Johnson used more advanced statistical techniques to test the assumption researchers commonly make: that a p of .05 means there’s a 5 percent chance the null hypothesis is true. His analysis revealed that it didn’t. “In fact there’s a 25 percent to 30 percent chance the null hypothesis is true when the p-value is .05,” Johnson said.
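A toy Bayes-rule calculation (my own numbers, not Johnson's analysis) illustrates why a significant result is not the same as a 5 percent chance that the null is true: the posterior probability of $H_0$ also depends on the prior odds and the power of the test.

```python
# Toy illustration of P(H0 | significant). Conditioning on "significant"
# rather than "p = .05 exactly" for simplicity; the prior and power values
# below are assumptions, chosen only to make the point.
prior_H0 = 0.5   # assumed prior probability that the null is true
alpha = 0.05     # false positive rate: P(significant | H0)
power = 0.5      # assumed P(significant | H1)

p_sig = alpha * prior_H0 + power * (1 - prior_H0)
posterior_H0 = alpha * prior_H0 / p_sig
print(f"P(H0 | significant) = {posterior_H0:.2f}")  # ~0.09 here, not 0.05
```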
Practical ways to avoid such issues are:

- Model checking
- Model Comparison
- The Bayesian Approach
It is worth noting that, as the sample size tends to infinity, the likelihood dominates the prior, and therefore the posterior mode tends towards the maximum likelihood estimate.
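This is easy to see numerically. A Beta-Bernoulli sketch of my own (fixed Beta(5, 5) prior, growing sample size):

```python
import numpy as np

# As n grows, the likelihood swamps the fixed Beta(5, 5) prior and the
# posterior mode approaches the MLE k/n.
rng = np.random.default_rng(0)
a, b, p_true = 5.0, 5.0, 0.3

for n in (10, 100, 10_000):
    k = rng.binomial(n, p_true)
    mle = k / n
    post_mode = (a + k - 1) / (a + b + n - 2)  # mode of Beta(a+k, b+n-k)
    print(f"n={n:>6}  MLE={mle:.4f}  posterior mode={post_mode:.4f}")
```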
## Model Comparison
This link gives a nice, succinct comparison of these model comparison techniques, with some further resources linked: https://www.stites.io/posts/2017-10-09-bic-dic-cv.html. What I take from it is that DIC is more useful than BIC for the most part, and that Cross Validation (which can also be used in a Bayesian setting) is the strongest, though it is more computationally intensive and offers no theoretical guarantees about performance on future data.
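A sketch of what the Cross Validation comparison looks like in practice (simulated data and model choices of my own, using scikit-learn):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Compare two nested linear models by 5-fold CV and prefer the one that
# predicts held-out data better. Only the first covariate actually matters.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = 1.5 * X[:, 0] + rng.normal(size=n)

for name, cols in [("x1 only", [0]), ("x1 + x2", [0, 1])]:
    scores = cross_val_score(LinearRegression(), X[:, cols], y, cv=5,
                             scoring="neg_mean_squared_error")
    print(f"{name}: mean CV MSE = {-scores.mean():.3f}")
```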
In the MLE setting, under some regularity conditions and in the large-sample limit, the MLE is approximately normal:

$$\hat{\theta} \sim N\left(\theta_0,\, I(\theta_0)^{-1}\right),$$

where $\theta_0$ is the true parameter and $I(\theta_0)$ is the Fisher information.
And in fact the following often holds in the Bayesian setting, in the large-sample limit: the posterior itself is approximately normal, centred at the MLE (the Bernstein-von Mises theorem),

$$\theta \mid x \sim N\left(\hat{\theta},\, I(\hat{\theta})^{-1}\right).$$
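A quick numerical check of this approximation (my own sketch: Bernoulli data under a flat prior, so the exact posterior is a Beta distribution):

```python
import numpy as np
from scipy import stats

# For Bernoulli data with a flat prior, the exact Beta posterior is close to
# N(theta_hat, I(theta_hat)^{-1}) once n is large.
n, k = 1000, 310
theta_hat = k / n
sd = np.sqrt(theta_hat * (1 - theta_hat) / n)  # inverse observed information

exact = stats.beta(k + 1, n - k + 1)           # posterior under a flat prior
approx = stats.norm(theta_hat, sd)
for q in (0.025, 0.5, 0.975):
    print(f"q={q}: exact={exact.ppf(q):.4f}  normal approx={approx.ppf(q):.4f}")
```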
## Random Effects

Random effects can be used to share information between observations and across covariates. One example would be to treat weather effects on total goals as random effects: weather types with few observations then borrow information from the common mean, so their estimates are not unduly driven by small-sample outliers.
One reason for using random effects is their interpretability. For example, they allow us to include patient effects in medical trials, modelling the random variability between patients while still allowing inferences about their shared covariates (treatment type, fat mass, height, etc.). Random effects are to a Frequentist what hierarchical modelling is to a Bayesian. A partial-pooling sketch of the weather example is below.
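A sketch of the weather example with statsmodels (simulated data; the column names and effect sizes are my own assumptions, not from any real dataset):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Random intercept per weather type; the fixed intercept is the common mean.
# Rare weather types ("wind", "snow") get shrunk towards that common mean.
rng = np.random.default_rng(2)
weather = rng.choice(["clear", "rain", "snow", "wind"], size=400,
                     p=[0.6, 0.25, 0.1, 0.05])
effect = {"clear": 0.0, "rain": -0.3, "snow": -0.6, "wind": -0.2}
goals = 2.5 + np.array([effect[w] for w in weather]) + rng.normal(0, 1, size=400)
df = pd.DataFrame({"goals": goals, "weather": weather})

# Note: with only four groups the variance estimate is rough; this is just
# to show the mechanics of fitting a random-intercept model.
model = smf.mixedlm("goals ~ 1", df, groups=df["weather"]).fit()
print(model.summary())
```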