Closed mshqn closed 8 months ago
Hi @mshqn ,
I am glad to see that you are interested in using Stabl. Your stability path indicates an overly broad exploration of the penalty parameter $\lambda$: the selection frequencies of the chosen features drop sharply beyond the threshold.
To obtain a more interpretable result, reduce the maximum value of C, for example to 2; you will then obtain the following result:
```python
stabl = Stabl(
    model,
    lambda_grid={"C": np.linspace(0.00001, 2, 100)},
    n_bootstraps=1000,
    artificial_type="knockoff",
    verbose=1,
    random_state=1,
)
stabl.fit(X, y)
plot_stabl_path(stabl)
stabl.get_feature_names_out()
```

Selected features: `array(['x1', 'x3', 'x4', 'x5', 'x7'], dtype=object)`
The informative features are normally `['x0', 'x1', 'x2', 'x3', 'x4']`, which is close to this result.
By construction, the x5 and x7 features are redundant, i.e. they are generated as random linear combinations of the informative features, so they are informative in their own way. In this example, it means the variable sets `['x0', 'x1', 'x2', 'x3', 'x4']` and `['x1', 'x3', 'x4', 'x5', 'x7']` are interchangeable in a multivariate analysis, as they provide the same information.
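This redundancy can be checked directly. The following sketch is my own illustration (not from the thread): with `shuffle=False`, `make_classification` places the informative columns first and the redundant ones right after, so we can regress each redundant column on the informative block.

```python
# Illustrative sketch: "redundant" features from make_classification
# are linear combinations of the informative ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression

# With shuffle=False the columns come in order:
# informative (0-4), redundant (5-6), then noise.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, n_redundant=2,
    shuffle=False, random_state=0,
)
informative, redundant = X[:, :5], X[:, 5:7]

# Regressing each redundant column on the informative block gives R^2 ~ 1:
# the redundant columns add no information beyond the informative ones.
r2_scores = [
    LinearRegression().fit(informative, redundant[:, j])
    .score(informative, redundant[:, j])
    for j in range(redundant.shape[1])
]
print(r2_scores)
```

Since each redundant column is an exact linear combination of the informative ones, both R² values come out at essentially 1, which is why the two feature sets are interchangeable in a multivariate model.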
Hi xavdurand,
Thanks for your answer. This is actually strange, as larger C values correspond to weaker regularization, and we could have expected an increase in sparsity after decreasing the max limit... if I am not mixing things up.
I wonder what the smart way to set the lambda limits for Stabl is. Max lambda = 2 caused it to select large feature subsets on my data, so I tried to increase it, as I need a sparser subset. But if I had no preference for sparsity, what should I have done?
I think fitting a regular Lasso with CV to select lambda.min would not be affected by changes in the lambda limits unless they change lambda.min.
I will answer step by step:
> Thanks for your answer. This is actually strange, as larger C values correspond to weaker regularization, and we could have expected an increase in sparsity after decreasing the max limit... if I am not mixing things up.
Following the scikit-learn `LogisticRegression` documentation, in the case of classification the `C` parameter is used and is the inverse of the regularization strength: a greater value decreases sparsity.
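As an illustrative check on hypothetical data (my own example, not from the issue), an L1-penalized `LogisticRegression` keeps more nonzero coefficients as `C` grows:

```python
# Sketch: larger C = weaker regularization = less sparsity
# for an L1-penalized logistic regression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

nonzero_counts = []
for C in (0.01, 0.1, 1.0, 10.0):
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    nonzero_counts.append(int(np.count_nonzero(clf.coef_)))
print(nonzero_counts)  # counts grow (weakly) as C increases
```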
> I wonder what the smart way to set the lambda limits for Stabl is. Max lambda = 2 caused it to select large feature subsets on my data, so I tried to increase it, as I need a sparser subset. But if I had no preference for sparsity, what should I have done?
It is possible to determine a good range of `C` using the `l1_min_c` function from `sklearn.svm`:

```python
from sklearn.svm import l1_min_c

# X is the input and y is the output
min_C = l1_min_c(X, y, loss="log")
```

Then, based on `min_C`, you can construct an interesting range by increasing it by a fixed step.
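As a sketch of how such a range might be built (the synthetic data, the log spacing, and the upper factor of 1000 are my own illustrative choices, not a recommendation from the thread):

```python
# Sketch: build a grid of C values starting from l1_min_c.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import l1_min_c

X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           random_state=0)

# Smallest C for which an L1-penalized logistic model can have
# at least one nonzero coefficient.
min_C = l1_min_c(X, y, loss="log")

# 30 values log-spaced from min_C up to 1000 * min_C.
C_grid = min_C * np.logspace(0, 3, 30)
```

A grid like this could then be passed as `lambda_grid={"C": C_grid}` in the Stabl call shown earlier in the thread.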
If you are interested in using Stabl, we can discuss by email: xdurand@surge.care.
Thanks, I didn't know about this function. Hope this will be useful for other users.
If you are interested, there is a similar behavior in the regression task with the maximum value of the $\lambda$ parameter of the Lasso: for $\lambda \ge \lambda_{\max} = \max_j |x_j^\top y| / n$ (with centered data), the Lasso solution is entirely zero.
source: Friedman, J. H., Hastie, T., & Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22. https://doi.org/10.18637/jss.v033.i01
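A quick numerical sketch of that regression-side behavior (the synthetic data, the centering, and the small safety factor are my own choices): any `alpha` at or above $\lambda_{\max}$ zeroes out every Lasso coefficient.

```python
# Sketch of lambda_max for the Lasso (Friedman, Hastie & Tibshirani, 2010):
# for alpha >= max_j |x_j^T y| / n on centered data, the solution is zero.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=100, n_features=10, noise=1.0, random_state=0)
X = X - X.mean(axis=0)
y = y - y.mean()

n = X.shape[0]
alpha_max = np.max(np.abs(X.T @ y)) / n

# A tiny safety factor guards against floating-point ties at the boundary.
coef = Lasso(alpha=1.001 * alpha_max, fit_intercept=False).fit(X, y).coef_
print(np.count_nonzero(coef))  # 0: every coefficient is shrunk to zero
```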
First of all, thank you for the interesting paper and the package! Your results on correlated data were promising, so I wanted to try Stabl on my data, which suffers from multicollinearity.
I got some pretty odd results (the output contained none of the features frequently selected by other feature-selection methods), so I wanted to try Stabl on a simple make_classification problem.
Here is my regular Lasso:

```python
array([[ 0.        , -0.54880256,  0.        , -1.978292  , -0.64994893,
        -1.26612462,  0.        , -0.72129673,  0.        ,  0.        ]])
```
And here is what I get with Stabl:

```python
array(['x3', 'x5', 'x7'], dtype=object)
```
How can we interpret the fact that Stabl selects only 3 features when 5 are informative?