stan-dev / rstanarm

rstanarm R package for Bayesian applied regression modeling
https://mc-stan.org/rstanarm
GNU General Public License v3.0
385 stars 132 forks source link

Fix for perfect collinearity in linear regression #534

Closed jgabry closed 2 years ago

jgabry commented 3 years ago

rstanarm's trick of parameterizing the log-likelihood using OLS coefficients as sufficient statistics makes linear regression much faster but, as @andrewgelman points out, it fails in the case of perfect collinearity.

@bgoodri Is your preferred solution still to use "minimum norm" OLS coefficients instead? If so can you implement that for the next release?

Here's a minimal example based on what @andrewgelman originally sent.

library(rstanarm)
N <- 100
y <- rnorm(N)
z <- sample(c(0,1), N, replace=TRUE)
x1 <- rnorm(N)
x2 <- 2*x1

fit_1 <-
  stan_glm(
    y ~ z * (x1 + x2),
    data = data.frame(y, z, x1, x2),
    prior = normal(location = 0, scale = 0.1),
    prior_intercept = normal(location = 0, scale = 0.1)
  )
bgoodri commented 3 years ago

Yeah, the functionality is already in Eigen but not exposed in Stan so we need to define another C++ function in the rstanarm headers that looks like

#ifndef RSTANARM__CODOLS_HPP
#define RSTANARM__CODOLS_HPP

/*
 * Compute ordinary least squares coefficients,
 * even in the situation where X is rank deficient
 * See https://eigen.tuxfamily.org/dox/classEigen_1_1CompleteOrthogonalDecomposition.html
 * /
template <typename T2__, typename T3__>
inline
Eigen::Matrix<typename boost::math::tools::promote_args<T2__, T3__>::type,
              Eigen::Dynamic, 1>
CODOLS(const Eigen::Matrix<T2__, Eigen::Dynamic, Eigen::Dynamic>& X,
       const Eigen::Matrix<T3__, Eigen::Dynamic, 1>& y,
       std::ostream* pstream__) {
  typename boost::math::tools::promote_args<T2__, T3__>::type T1__
  Eigen::CompleteOrthogonalDecomposition<T1__> cod(X);
  return cod.solve(y);
}
jgabry commented 2 years ago

@bgoodri Any update on this issue? Andrew was asking about it again recently. At one point I think you said you had a branch, or am I misremembering? If so is it ready for Andrew to try it?