Separation is a current problem with `bas.glm` in logistic regression, as it uses nested integrated Laplace approximations that depend on the MLEs. Under quasi- or perfect separation, the MLEs are not identifiable; Firth logistic regression addresses this through data augmentation, or equivalently through a penalty term based on Jeffreys invariant prior.
For the data augmentation (DA), an augmented data set with $3n$ observations is constructed by stacking the design matrix three times, with a response vector consisting of the original $Y$, an $n \times 1$ vector of 1's, and an $n \times 1$ vector of 0's. The augmented responses are accompanied by weights $w_i = h_i/2$, where $h_i$ is the $i$th diagonal element of
$$H = W^{1/2}X(X^TWX)^{-1}X^TW^{1/2}$$
and $W$ is a diagonal matrix with $[W]_{ii} = \pi_i(1 - \pi_i)$.
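The augmentation above can be sketched numerically. This is a minimal illustration in Python/numpy (not the `bas.glm` implementation); the design matrix, response, and starting value $\beta = 0$ are made-up assumptions for the example:

```python
import numpy as np

# Hypothetical small design matrix and binary response (illustration only).
rng = np.random.default_rng(0)
n, p = 8, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = rng.integers(0, 2, size=n).astype(float)

def hat_diag(X, beta):
    """Diagonal of H = W^{1/2} X (X'WX)^{-1} X' W^{1/2} for logistic pi."""
    pi = 1.0 / (1.0 + np.exp(-X @ beta))
    w = pi * (1.0 - pi)                      # [W]_{ii} = pi_i (1 - pi_i)
    Xw = X * np.sqrt(w)[:, None]             # W^{1/2} X
    XtWX_inv = np.linalg.inv(Xw.T @ Xw)      # (X'WX)^{-1}
    return np.einsum("ij,jk,ik->i", Xw, XtWX_inv, Xw)

beta = np.zeros(p)                           # assumed starting value
h = hat_diag(X, beta)

# 3n augmented data set: X stacked three times; responses (Y, 1's, 0's);
# weights 1 on the original block and h_i/2 on each pseudo-observation block.
X_aug = np.vstack([X, X, X])
y_aug = np.concatenate([y, np.ones(n), np.zeros(n)])
w_aug = np.concatenate([np.ones(n), h / 2, h / 2])

# Since H is a projection of rank p, trace(H) = p, so the total
# pseudo-observation weight added per row is h_i, summing to p overall.
print(np.isclose(h.sum(), p))  # True
```

Note that the pseudo-observations contribute total weight $\sum_i h_i = \operatorname{tr}(H) = p$, i.e. the Jeffreys penalty adds the equivalent of $p$ observations split evenly between successes and failures.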
Questions in implementation:
- need to fix the incorrect standard errors in weighted regressions and GLMs (issue #52)
- should the weights be based on the full model, or computed separately for each model? (model-dependent seems ideal)
- can the data augmentation instead add only $n$ observations via sufficiency, i.e. adding responses of 0.5 with weights $w_i = h_i$?
- obtaining the MLEs will require iterative updating, but this should be feasible with the current iteratively reweighted least squares implementation for logistic regression
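As a sketch of the iterative updating mentioned above, Firth's penalty can be folded directly into an IRLS-style loop by using the modified score $X^T\bigl(y - \pi + h \odot (1/2 - \pi)\bigr)$, which is equivalent to the data-augmentation scheme. This is a minimal Python/numpy illustration under assumed toy data (a perfectly separated sample where the ordinary MLE diverges), not the `bas.glm` code path:

```python
import numpy as np

def firth_logistic(X, y, max_iter=50, tol=1e-8):
    """Firth-penalized logistic regression via modified-score Newton/IRLS.

    The score is adjusted to X'(y - pi + h * (1/2 - pi)), where h is the
    diagonal of H = W^{1/2} X (X'WX)^{-1} X' W^{1/2}.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-X @ beta))
        w = pi * (1.0 - pi)
        Xw = X * np.sqrt(w)[:, None]
        info_inv = np.linalg.inv(Xw.T @ Xw)             # (X'WX)^{-1}
        h = np.einsum("ij,jk,ik->i", Xw, info_inv, Xw)  # diag of H
        score = X.T @ (y - pi + h * (0.5 - pi))         # Firth-modified score
        delta = info_inv @ score
        beta = beta + delta
        if np.max(np.abs(delta)) < tol:
            break
    return beta

# Perfectly separated toy data: the ordinary MLE diverges to infinity,
# but the Firth estimates stay finite.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = (x > 0).astype(float)
beta = firth_logistic(X, y)
print(beta)  # finite estimates despite separation
```

The `n`-observation sufficiency variant raised above would replace the two pseudo-rows $(1, h_i/2)$ and $(0, h_i/2)$ with a single row having response $0.5$ and weight $h_i$; its weighted log-likelihood contribution, $h_i\,(0.5\log\pi_i + 0.5\log(1-\pi_i))$, matches the two-row version term by term.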
Of interest for JASP, @vandenman?