Multicollinearity in rlassoEffect: Consider Y = a*X + 0*Z + e, where X and Z are collinear. In rlassoEffect, the residualized regressor \tilde{X} (X after partialling out Z) seems to be very close to zero but not exactly zero, which makes the coefficient estimate for a numerically unstable (it becomes very large). Would it help if the package detected multicollinearity in the original model before running double lasso? Alternatively, we could remove the multicollinearity from [X, Z] first.
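To make the mechanism explicit, here is a sketch that assumes the estimate takes the usual residual-on-residual (partialling-out) form. The symbol \tilde{Y}, for Y after partialling out Z, is notation introduced here and is not taken from the package:

$$
\hat{a} = \frac{\sum_i \tilde{X}_i \tilde{Y}_i}{\sum_i \tilde{X}_i^2}
$$

If Z explains X (almost) perfectly, every \tilde{X}_i is of the order of rounding error, so the denominator is nearly zero and \hat{a} can become arbitrarily large.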
This came up in the Penn reemployment analysis: once we include three-way interactions, rlassoEffect fits T4 (the treatment) perfectly because of multicollinearity. The residualized T4 is so close to zero that the double lasso estimate blows up:
If we remove the multicollinearity before running double lasso, the estimate is reasonable again:
Thank you very much -- this is an awesome observation. We need to think about how to “auto-detect” and remove this problem.
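One possible shape for such a check, as a minimal sketch only: it is written in plain Python for illustration and is not part of hdm. The helper names (`project_out`, `independent_columns`), the tolerance, and the simulated data are all assumptions. The treatment is placed first so that it is always kept, and any later column that is numerically a linear combination of the columns already kept is dropped.

```python
import random
from math import sqrt


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def project_out(col, basis):
    """Residual of `col` after removing its projection onto an orthonormal basis."""
    r = list(col)
    for _ in range(2):  # second Gram-Schmidt pass improves numerical stability
        for q in basis:
            c = dot(q, r)
            r = [x - c * y for x, y in zip(r, q)]
    return r


def independent_columns(columns, tol=1e-8):
    """Indices of columns that are not linear combinations of earlier kept columns.

    `tol` is a hypothetical relative threshold on the residual norm.
    """
    basis, kept = [], []
    for j, col in enumerate(columns):
        scale = sqrt(dot(col, col))
        resid = project_out(col, basis)
        norm = sqrt(dot(resid, resid))
        if scale > 0 and norm > tol * scale:
            basis.append([x / norm for x in resid])
            kept.append(j)
    return kept


# Simulated example: d is the treatment, z1..z3 are controls, and
# z3 = d - z1, so d is an exact linear combination of the controls.
random.seed(0)
n = 50
d = [random.gauss(0, 1) for _ in range(n)]
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
z3 = [a - b for a, b in zip(d, z1)]

kept = independent_columns([d, z1, z2, z3])
print(kept)  # [0, 1, 2]: the redundant control z3 is dropped
```

Only the columns in `kept` would then be passed on to the double lasso step. In R, the same idea could probably be expressed with a pivoted QR decomposition (base `qr()` reports a `rank` and a `pivot`), but how best to wire that into rlassoEffect is left open here.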