One of the main tasks of statistical modeling is to exploit the association between a response variable and multiple predictors. The linear model (LM), a simple parametric regression model, is often used to capture linear dependence between the response and the predictors. Generalized linear models (GLMs) extend the linear model to other types of responses. Parameter estimation in these models can be computationally intensive when the number of predictors is large. Meanwhile, Occam's razor is widely accepted as a heuristic rule for statistical modeling: it balances goodness of fit against model complexity, and it leads to a relatively small subset of important predictors.
The BeSS package provides solutions to the best subset selection problem for sparse LMs and GLMs.
We consider a primal-dual active set (PDAS) approach to exactly solve the best subset selection problem for sparse LMs and GLMs. It uses an active set updating strategy and fits the sub-models using complementary primal and dual variables. We generalize the PDAS algorithm to general convex loss functions with the best subset constraint.
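To make the active set idea concrete, here is a minimal sketch of a primal-dual active set iteration for the least-squares case with at most k nonzero coefficients. It is an illustration written with NumPy only, not the package's implementation or API: the function name `pdas_linear`, its parameters, the stopping rule, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def pdas_linear(X, y, k, max_iter=20):
    """Illustrative PDAS-style iteration for least squares with at most k nonzeros.

    Hypothetical helper for exposition; it is not part of the BeSS package.
    """
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0) / n  # x_j^T x_j / n for every column

    def fit(active):
        # Primal step: ordinary least squares restricted to the active set.
        beta = np.zeros(p)
        beta[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        return beta

    # Start from beta = 0 and pick the k predictors with the largest dual score.
    d = X.T @ y / n
    active = np.sort(np.argsort(-0.5 * d ** 2 / col_sq)[:k])

    for _ in range(max_iter):
        beta = fit(active)
        # Dual step: scaled gradient of the loss at the current fit.
        d = X.T @ (y - X @ beta) / n
        is_active = np.zeros(p, dtype=bool)
        is_active[active] = True
        # Score each predictor: loss increase if an active one is dropped,
        # loss decrease if an inactive one is added.
        score = np.where(is_active,
                         0.5 * col_sq * beta ** 2,
                         0.5 * d ** 2 / col_sq)
        new_active = np.sort(np.argsort(-score)[:k])
        if np.array_equal(new_active, active):
            return beta, active  # active set is stable
        active = new_active
    return fit(active), active


# Synthetic example (assumed data): 3 relevant predictors out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))
true_beta = np.zeros(20)
true_beta[[2, 7, 11]] = [3.0, -2.0, 1.5]
y = X @ true_beta + 0.1 * rng.standard_normal(500)

beta, active = pdas_linear(X, y, k=3)
print(active)
```

The sketch fixes the subset size k in advance. Choosing k, and replacing the least-squares fit with a general convex loss, are left out to keep the example short.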
The package has been published on PyPI. You can install it easily with:
$ pip install bess
To download and install the R package BeSS from CRAN:
install.packages("BeSS")
Or try the development version on GitHub:
# install.packages("devtools")
devtools::install_github("Mamba413/bess/R")
Please send an email to Jiang Kangkang (jiangkk3@mail2.sysu.edu.cn).