Closed: belzheng closed this 2 years ago
Hi @belzheng,
So `get_models_with_weights()` will give you the ensemble weight for each model included in the ensemble. From there, you can query the models it returns to get further information about a specific model. I'm not really sure what you mean by "regression coefficients of the final ensemble model" in this case.
The way the ensemble works for regression is that we take a simple weighted sum over the regression outputs of each model to get the final output. In the case of classification, the same is done with the predicted probabilities before they are turned into classes.
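To make the weighted-sum idea concrete, here is a minimal sketch in plain Python. The function and "model" names here are made up for illustration; this is not auto-sklearn's internal API.

```python
# Minimal sketch of a weighted-sum ensemble prediction.
# Names are hypothetical, not auto-sklearn internals.

def ensemble_predict(models_with_weights, x):
    """Weighted sum of each base model's regression output."""
    return sum(weight * model(x) for weight, model in models_with_weights)

# Two toy "models": plain functions standing in for fitted regressors.
model_a = lambda x: 2.0 * x    # predicts 2x
model_b = lambda x: x + 1.0    # predicts x + 1

# Ensemble weights sum to 1.0, as they do in the real ensemble.
ensemble = [(0.75, model_a), (0.25, model_b)]

print(ensemble_predict(ensemble, 4.0))  # 0.75*8.0 + 0.25*5.0 = 7.25
```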
If you have some sample code showing what you would like to be able to do, that would help me understand the question some more and could help influence further design :) Thanks for reaching out!
Best, Eddie
Ok, thanks, let me explain the problem in another way; here is some sample code to illustrate the question:
So, my question is: how can I get the specific coefficients in the ensemble model?
Hi @belzheng,
Sorry for the lack of response, conferences have been taking up our time :)
So autosklearn doesn't have `coef_`. A simple model may have some simple learned parameters, which are often exported as `coef_`, but not every model will have them; take for example a `KNeighborsRegressor`. Autosklearn will train a large number of different scikit-learn models and then produce an ensemble of them, as seen in `models_with_weights()`. Most of these will probably not have any `coef_`.
A good way to illustrate the problem: if we were to have a `coef_` attribute like a linear regressor does, what would we put in there? There's no meaningful answer we can come up with that makes sense. The closest reasonable answer is the weights of the models in the final produced ensemble, but these models will differ from dataset to dataset, so comparing these weights from run to run has no real use.
If you were to share your use case, I could perhaps point you in the right direction but I feel there may be some misunderstanding of what autosklearn does.
Best, Eddie
Thank you very much! So my new question is: since there are no coefficients here, how do we make predictions for the regression?
You do not need coefficients for the task of regression, which is to estimate continuous numerical values from the data. You could create a model which always returns the mean of the training data; this is a perfectly valid regression model that has no `coef_`.
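For instance, such a "mean regressor" can be written in a few lines. This is a toy sketch for illustration, not an auto-sklearn or scikit-learn class:

```python
# A toy "mean regressor": a valid regression model with no coef_.
# Hypothetical illustration; not part of any library.

class MeanRegressor:
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)  # the only learned "parameter"
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]

reg = MeanRegressor().fit([[0], [1], [2]], [1.0, 2.0, 3.0])
print(reg.predict([[10], [20]]))  # [2.0, 2.0]
print(hasattr(reg, "coef_"))      # False
```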
The coefficients you mention are part of the `LinearRegression` model as well as some others, generally "linear" models. There is a whole variety of other models in sklearn that don't have `coef_`, for example a `DecisionTreeRegressor`. It can do regression without `coef_`, but it has other learned parameters, like `tree_`, that come from the data.
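A quick way to check this, assuming scikit-learn is installed:

```python
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

X, y = [[0.0], [1.0], [2.0], [3.0]], [0.0, 1.0, 2.0, 3.0]

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(random_state=0).fit(X, y)

print(hasattr(linear, "coef_"))  # True:  linear models learn coefficients
print(hasattr(tree, "coef_"))    # False: trees have no coef_ at all
print(hasattr(tree, "tree_"))    # True:  the learned structure lives here
```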
This question is no longer really about auto-sklearn, so I will close the issue. I advise exploring the scikit-learn documentation some more: find some "non-linear" models and investigate how they perform regression.
Best, Eddie
Hey, I think there was some talking past each other. Weights in a linear model are sometimes called "coefficients", and scikit-learn has the convention of exposing them under the name `coef_`. `EnsembleSelection` is a linear model, so one could assume there is an attribute called `coef_`. However, we don't follow this convention here and instead assign the weights to a variable called `weights_`. I'm wondering whether we should change the variable name or simply add a second attribute `coef_` for compatibility?
That's a good point I didn't consider, but I think this would raise quite a few questions as to what `coef_` means coming from autosklearn. I could see someone finding `automl.coef_` and then wondering what those values are and what they mean. This would leave both `weights_` and `coef_` floating around.
I had a brief look at all the ensembles in sklearn, as well as their general machine learning guide on ensembles, and found no reference to `coef_`; my guess is that it is reserved for individual models that use the coefficients directly, not as weights on other models.
Good points regarding the ensembles - we're somewhat of a voting classifier, and that doesn't have `coef_` either. Therefore, I think we can leave things as they are.
Thanks for your reply, and I'm sorry for taking so long to respond. What I want to ask is whether sklearn's lasso regression can be added as a regressor in autosklearn, so that autosklearn's feature preprocessing can be used. However, I don't know how to extract the coefficients of this linear model the way sklearn can display them. When I tried to extract the coefficients after the feature transformation myself, I found that the `coef_` attribute of the lasso had disappeared. Below is some sample code to help you understand my problem better. Thanks!
Hi @belzheng,
Thanks for the more descriptive answer; you should get the underlying estimator that you wrap in your LassoRegression class.
```python
stepauto = regpip1.steps          # steps of the fitted pipeline
regressor = stepauto[-1][1]       # last step: the auto-sklearn wrapper
regressor.coef_                   # Error: the wrapper itself has no coef_
regressor.estimator.coef_         # Should be here, on the wrapped estimator.
```
You don't need to refit this pipeline; you could also get it directly from `show_models()`. The main issue was that you were trying to access `coef_` on the wrapper, and you need to access the underlying `estimator` to get all the sklearn parameters.
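The general wrapper-vs-underlying-estimator pattern can be shown with a toy example. The class names here are hypothetical stand-ins, not auto-sklearn's real component classes:

```python
# Toy illustration of the wrapper vs. underlying estimator issue.
# Class names are made up for this sketch.

class SimpleLasso:
    """Stand-in for sklearn's Lasso: exposes coef_ after fitting."""
    def fit(self, X, y):
        self.coef_ = [0.5]  # pretend learned coefficient
        return self

class ComponentWrapper:
    """Stand-in for an auto-sklearn component wrapping an estimator."""
    def fit(self, X, y):
        self.estimator = SimpleLasso().fit(X, y)
        return self

wrapper = ComponentWrapper().fit([[1.0]], [1.0])

print(hasattr(wrapper, "coef_"))  # False: the wrapper has no coef_
print(wrapper.estimator.coef_)    # [0.5]: found on the wrapped model
```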
I hope this solves your problem?
Best, Eddie
Short Question Description
Hi, I would like to know: if I use auto-sklearn for regression, e.g. specifying that the regressor is a random forest, I know that I can get the weights and models of the ensemble via `get_models_with_weights()` and `show_models()`. My question is whether I can get the regression coefficients of the final ensemble model by calling some API; if not, can users write their own code to get the coefficients or display an expression of the ensemble model?