abess-team / abess

Fast Best-Subset Selection Library
https://abess.readthedocs.io/
Other
474 stars 41 forks

What do multiple sets of coefficients mean? #535

Open mytarmail opened 6 months ago

mytarmail commented 6 months ago

Here's a more detailed question: why does the model have several sets of coefficients (five in this case), and how can I tell which set belongs to the trained model?

Code for Reproduction

y <- matrix(rnorm(200), ncol = 4) ; colnames(y) <- paste0("y", 1:ncol(y))
x <- matrix(rnorm(200), ncol = 4) ; colnames(x) <- paste0("x", 1:ncol(x))
library(abess)
abess_fit <- abess(x, y, family = "mgaussian")

abess_fit[["beta"]]
abess_fit[["intercept"]]
> abess_fit[["beta"]]
$`0`
4 x 4 diagonal matrix of class "ddiMatrix"
   y1 y2 y3 y4
x1  0  .  .  .
x2  .  0  .  .
x3  .  .  0  .
x4  .  .  .  0

$`1`
4 x 4 sparse Matrix of class "dgCMatrix"
           y1         y2        y3        y4
x1  .         .          .          .       
x2 -0.1015219 0.02199386 0.1122985 -0.250586
x3  .         .          .          .       
x4  .         .          .          .       

$`2`
4 x 4 sparse Matrix of class "dgCMatrix"
            y1          y2          y3          y4
x1  .           .           .           .         
x2 -0.09682036  0.03178114  0.11441030 -0.24966314
x3  .           .           .           .         
x4 -0.11844742 -0.24657644 -0.05320414 -0.02324879

$`3`
4 x 4 sparse Matrix of class "dgCMatrix"
            y1          y2          y3          y4
x1  .           .           .           .         
x2 -0.07059541  0.03911579  0.10756961 -0.23451269
x3  0.17556163  0.04910145 -0.04579469  0.10142396
x4 -0.14374648 -0.25365213 -0.04660497 -0.03786434

$`4`
4 x 4 sparse Matrix of class "dgCMatrix"
            y1          y2          y3          y4
x1 -0.01256233  0.02601888  0.10887272 -0.07605923
x2 -0.06877012  0.03533528  0.09175053 -0.22346137
x3  0.17846661  0.04308472 -0.07097094  0.11901226
x4 -0.14415620 -0.25280352 -0.04305406 -0.04034503

> abess_fit[["intercept"]]
[[1]]
[1]  0.10044828  0.11732645 -0.15248544  0.07686929

[[2]]
[1]  0.1096521  0.1153325 -0.1626663  0.0995871

[[3]]
[1]  0.10848385  0.11290046 -0.16319105  0.09935779

[[4]]
[1]  0.08672936  0.10681612 -0.15751646  0.08678998

[[5]]
[1]  0.08584996  0.10863751 -0.14989508  0.08146563
Mamba413 commented 6 months ago

@mytarmail , thanks for this question.

abess fits one model for each candidate subset size. In your example, x has 4 columns, so the candidate sizes are 0, 1, 2, 3, and 4. That is why you get 5 sets of coefficients.

I guess you want the coefficients at the optimal size (i.e., the size that best balances predictive accuracy and model complexity). You can get them with the following code:

abess_fit <- abess(x, y, family = "mgaussian")
extract(abess_fit)
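
To make the mapping between the 5 list entries and the subset sizes concrete, here is a minimal sketch. It assumes the fitted object exposes `support.size` and `best.size` elements and that `extract()` returns a list with `support.size`, `beta`, and `intercept` (as I understand the package documentation; check against your installed version). The `set.seed()` call is added only so the sketch is repeatable and will not reproduce the numbers shown above.

```r
library(abess)

set.seed(1)  # assumption: seed added for repeatability, not part of the original post
y <- matrix(rnorm(200), ncol = 4); colnames(y) <- paste0("y", 1:ncol(y))
x <- matrix(rnorm(200), ncol = 4); colnames(x) <- paste0("x", 1:ncol(x))
abess_fit <- abess(x, y, family = "mgaussian")

# Candidate subset sizes that were fitted (expected: one per entry in beta).
sizes <- abess_fit[["support.size"]]
print(sizes)
print(names(abess_fit[["beta"]]))  # beta is named by size: "0", "1", ...

# Size chosen by the tuning criterion.
best_size <- abess_fit[["best.size"]]
print(best_size)

# Coefficients and intercept at the chosen size, via extract().
best <- extract(abess_fit)
print(best[["support.size"]])
print(best[["beta"]])
print(best[["intercept"]])

# The same lookup by hand: beta is indexed by name, intercept by position.
beta_by_hand <- abess_fit[["beta"]][[as.character(best_size)]]
intercept_by_hand <- abess_fit[["intercept"]][[which(sizes == best_size)]]
```

Note that the `beta` list is named by size while the `intercept` list is unnamed, so `intercept[[1]]` pairs with `beta[["0"]]`, `intercept[[2]]` with `beta[["1"]]`, and so on.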
mytarmail commented 6 months ago

Thanks for the quick response. My problem is that I want not only to get the coefficients using extract(abess_fit), but also to be able to change those coefficients inside the abess_fit model; see my first example here.
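
For illustration, a rough sketch of what editing a coefficient inside the object could look like. It assumes the fitted object is an ordinary R list (which the `[[` access in the first post suggests) and that the `beta` entries for sizes above 0 are sparse matrices from the Matrix package. The names `edited_fit`, `size_key`, and `beta_2` are hypothetical. Whether `predict()` or `coef()` will use the edited values is not confirmed in this thread.

```r
library(abess)
library(Matrix)  # beta entries are sparse Matrix objects

set.seed(1)  # assumption: seed added for repeatability
y <- matrix(rnorm(200), ncol = 4); colnames(y) <- paste0("y", 1:ncol(y))
x <- matrix(rnorm(200), ncol = 4); colnames(x) <- paste0("x", 1:ncol(x))
abess_fit <- abess(x, y, family = "mgaussian")

# Work on a copy so the original fit stays available for comparison.
edited_fit <- abess_fit

# Pick the coefficient set for subset size 2 (list names are "0", "1", ...).
size_key <- "2"
beta_2 <- edited_fit[["beta"]][[size_key]]

# Overwrite one coefficient, then write the matrix back into the object.
beta_2["x2", "y1"] <- 0.5
edited_fit[["beta"]][[size_key]] <- beta_2

print(edited_fit[["beta"]][[size_key]])
```

Before relying on this, it would be worth comparing predictions from `edited_fit` and `abess_fit` at the same subset size to see whether the change actually takes effect.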