automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License
7.58k stars 1.28k forks source link

Revisited: Missing multiclass-multioutput support #1685

Open tron27 opened 1 year ago

tron27 commented 1 year ago

Good morning all, I am revisiting the concern regarding if/when auto-sklearn will provide support for multiclass-multioutput data. The original issue can be found below:

https://github.com/automl/auto-sklearn/issues/292

In brief, it was mentioned (from the link above) that once scikit-learn provides metrics to evaluate multiclass-multioutput predictions, there will be a clean issue for auto-sklearn to also provide support for multiclass-multioutput data. It seems as though scikit-learn now offers/supports multiclass-multioutput. See the link below:

https://scikit-learn.org/stable/modules/multiclass.html

My question is, does any of the contributors know where auto-sklearn is regarding support for this feature? I would like to apply automl to predict a location (latitude/longitude). Lastly, I would like to thank all of the contributors for all that you do! I and the community greatly appreciate it. Attached is a screenshot of me trying to fit my X_train and y_train data. Unfortunately, I get an error regarding continuous-multioutput not being supported (see screenshot below). As you can see, I'm trying to apply automl in four lines of code.

Continuous multioutput error:

continuous multioutput error

System Details: Google Colab which runs on Ubuntu version 22.04 (see screenshot)

Google Colab OS

auto-sklearn: 0.16.0.dev0 (see screenshot) - Thank you to @Frankothe196 for helping me with this!

auto-sklearn install auto-sklearn version

Thanks, Eric

System Details (if relevant)

charlesfu4 commented 1 year ago

Hi @tron27. If I understood it correctly. You're trying to conduct a classification task with labels in continuous variables type which is not possible. Have you tried making the continuous variable binned or into categorial ones? If the purpose is to predict continuous variables, instead of using classification models, one should go with regression models.

Kr, Charles