automl / auto-sklearn

Automated Machine Learning with scikit-learn
https://automl.github.io/auto-sklearn
BSD 3-Clause "New" or "Revised" License
7.66k stars, 1.28k forks

Fails when installing via pip #1681

Closed FranklinBarto closed 1 year ago

FranklinBarto commented 1 year ago

Describe the bug

The package fails to install via pip. It depends on scikit-learn>=0.24.0,<0.25.0, which itself fails to install.

To Reproduce

Steps to reproduce the behavior:

  1. pip install auto-sklearn

Expected behavior

auto-sklearn is expected to install successfully.

Actual behavior, stacktrace or logfile

pip install auto-sklearn
Defaulting to user installation because normal site-packages is not writeable
Collecting auto-sklearn
  Using cached auto-sklearn-0.15.0.tar.gz (6.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting scikit-learn<0.25.0,>=0.24.0
  Using cached scikit-learn-0.24.2.tar.gz (7.5 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [786 lines of output]
      Partial import of sklearn during the build process.
      C compiler: x86_64-linux-gnu-gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -fPIC

Environment and installation:

Please give details about your installation:

The installation also fails on Google Colab.

eddiebergman commented 1 year ago

Hi @Frankothe196, there is no Python 3.10-compatible release of scikit-learn==0.24.2. My suggestion is to downgrade your Python to 3.9. If that is not possible, you can download the scikit-learn source code and run python -m build --wheel . inside the source directory with a Python 3.10 environment.

In the very worst case, I have also pre-compiled a version for Linux, which you can download from this link: https://ml.informatik.uni-freiburg.de/~bergmane/scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl and install with pip install scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl. This is not a permanent solution and I will likely remove it sometime soon, so do not depend on it for anything long term.
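Editorial sketch: the incompatibility described above can be checked before running pip. The snippet below is a minimal illustration, not part of auto-sklearn. The helper name `has_prebuilt_wheel` is hypothetical, and the supported range (Python 3.6 through 3.9 for prebuilt scikit-learn 0.24.x wheels) is an assumption that should be verified against the files published on PyPI.

```python
import sys

# Assumption: prebuilt scikit-learn 0.24.x wheels target Python 3.6 to 3.9.
# Outside that range pip falls back to building from the source tarball.
MIN_PY = (3, 6)
MAX_PY = (3, 9)


def has_prebuilt_wheel(version_info=sys.version_info):
    """Return True if the interpreter is inside the assumed wheel range."""
    major_minor = (version_info[0], version_info[1])
    return MIN_PY <= major_minor <= MAX_PY


current = ".".join(str(part) for part in sys.version_info[:3])
if has_prebuilt_wheel():
    print(f"Python {current}: a prebuilt scikit-learn 0.24.x wheel is likely available")
else:
    print(f"Python {current}: pip will likely try to build scikit-learn 0.24.x from source")
```

Running this in a fresh Colab cell before `pip install auto-sklearn` shows which of the two cases applies to the current runtime.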

FranklinBarto commented 1 year ago

I figured that I needed to downgrade, but I don't want to do that, because Google Colab now ships with Python 3.10 by default. I found that scikit-learn 1.3.0 is compatible with Python 3.10, and that is the version I have been working with. The question is: why does auto-sklearn depend on an outdated scikit-learn? Thanks for the pointers though. I'll explore the source code and see if I can upgrade this code base to work with scikit-learn 1.3.0. I'll also look into your suggestions shortly!

eddiebergman commented 1 year ago

> The reason being google colab now ships with python 3.10 by default. I found a scikit-learn version 1.3.0 is compatible with python 3.10 which I have been working with.

That's exactly why I manually built the scikit-learn==0.24.2 wheel above to work with Python 3.10. If you find a better solution, please do let me know, but putting the following into a Colab cell works on our end:

!wget https://ml.informatik.uni-freiburg.de/~bergmane/scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip uninstall --yes scipy
!pip install ./scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip install auto-sklearn==0.15.0
!pip install -U numpy==1.23.5

> Question is why does auto-sklearn depend on the outdated scikit-learn?

We need to update all the meta-data that autosklearn relies on. However, many more things were enabled in newer scikit-learn versions, and it was also time for a refactor. Please see issue #1677 :)

FranklinBarto commented 1 year ago

Ooooh ok, this makes a lot of sense now @eddiebergman

What I did was fork the auto-sklearn repo and remove the explicit scikit-learn version pin in requirements.txt.

Basically, scikit-learn>=0.24.0,<0.25.0 was changed to scikit-learn.

So now I can install my version via: !pip install git+https://github.com/Frankothe196/auto-sklearn.git@python3.10-added-compatibility

It seems to be installing fine, but I'm fairly new to machine learning, so there may be underlying issues I wouldn't be able to spot.
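Editorial sketch: to make the effect of the pin concrete, the snippet below checks two versions against the range `>=0.24.0,<0.25.0`. It is a simplified illustration using plain tuple comparison, not pip's actual resolver. The helpers `parse_version` and `satisfies_pin` are hypothetical names, and only plain numeric versions such as `0.24.2` are handled (no pre-releases or other PEP 440 forms).

```python
def parse_version(text):
    """Turn '0.24.2' into (0, 24, 2); handles plain numeric releases only."""
    return tuple(int(part) for part in text.split("."))


def satisfies_pin(version, lower="0.24.0", upper="0.25.0"):
    """Return True if lower <= version < upper, mirroring '>=0.24.0,<0.25.0'."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


for candidate in ("0.24.2", "1.3.0"):
    status = "allowed" if satisfies_pin(candidate) else "rejected"
    print(f"scikit-learn {candidate}: {status} by the original pin")
```

With the pin removed, pip is free to pick a newer scikit-learn such as 1.3.0, which installs on Python 3.10. Whether auto-sklearn 0.15.0 then behaves correctly is a separate question that the install step alone does not answer.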

tron27 commented 1 year ago

Hello all, I realize that this issue is closed, but I just came across it. I've been struggling to get auto-sklearn to work properly for about 3 days now. I've downgraded my Python version from 3.10.12 to 3.9.18 in Google Colab. Although I was able to install auto-sklearn, I'm unable to import it or any of its submodules, such as classification. Attached is my Google Colab file. Any help will be greatly appreciated. I've sincerely tried to figure this out, but it's above my capabilities at this point. Much of the code in this file came from Stack Overflow; I understood it well enough to use it. Thanks, Eric!

[Attachment: Auto-sklearn Install.ipynb - Colaboratory.pdf]

FranklinBarto commented 1 year ago

Hey @tron27, if you're sure auto-sklearn installed successfully, your issue is probably caused by how pip or Python is configured. You could look into how to reconfigure pip after a Python downgrade in Google Colab. Maybe this is an issue others have experienced and been able to fix.

You can try the commands below to downgrade Python and reconfigure pip:

!sudo apt-get update -y

!sudo apt-get install python3.8

!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1

!sudo update-alternatives --config python3

!apt-get install python3-pip

!python -m pip install --upgrade pip --user
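Editorial sketch: after switching interpreters with commands like those above, one possible cause of "installs but cannot import" is that pip installs into a different interpreter than the one running the notebook. This is a guess, not something confirmed in this thread. The snippet below prints the interpreter that is executing the cell and builds a pip command bound to that same interpreter. The helper names `pip_command` and `describe_environment` are hypothetical.

```python
import sys


def pip_command(*args):
    """Build a pip invocation bound to the interpreter running this code."""
    return [sys.executable, "-m", "pip", *args]


def describe_environment():
    """Return the running interpreter's path and its version string."""
    return {"executable": sys.executable, "python": sys.version.split()[0]}


print(describe_environment())
# Printing the command rather than running it keeps this cell free of side effects.
print(" ".join(pip_command("install", "auto-sklearn==0.15.0")))
```

If the printed Python version is not the one that was just installed, packages added with a bare `pip` may land somewhere the running interpreter does not look.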

But I would personally advise you to just use the default Python version in Google Colab and follow the steps by @eddiebergman to install autosklearn.

!wget https://ml.informatik.uni-freiburg.de/~bergmane/scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip uninstall --yes scipy
!pip install ./scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip install auto-sklearn==0.15.0
!pip install -U numpy==1.23.5

You could also test out my solution below to see if the import works. Let me know how it works out for you if you take this route.

!pip install git+https://github.com/Frankothe196/auto-sklearn.git@python3.10-added-compatibility

The line above also installs autosklearn, but it hasn't been tested, so some functions may import but not work.

tron27 commented 1 year ago

Franklin, Thank you for responding! I tried the following code in a new Google Colab file.

!wget https://ml.informatik.uni-freiburg.de/~bergmane/scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip uninstall --yes scipy
!pip install ./scikit_learn-0.24.2-cp310-cp310-linux_x86_64.whl
!pip install auto-sklearn==0.15.0
!pip install -U numpy==1.23.5

It seems that auto-sklearn version 0.15.0 installed fine. There was an error during the installation relating to an incompatibility with scikit-learn, but scikit-learn 0.24.2 and scipy 1.11.2 were successfully installed, so I assumed that this wasn't an issue (see the attached screenshot). Unfortunately, when I tried to run the following command:

import autosklearn.classification

I received an error (check attachment). I will test out your solution and get back to you asap! Thanks!
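Editorial sketch: when an install reports success but the import fails, listing the versions that the running interpreter actually sees can narrow things down. The snippet below uses the standard-library `importlib.metadata` module (Python 3.8 or newer). The helper name `installed_versions` and the default package list are illustrative choices, not something prescribed by auto-sklearn.

```python
from importlib import metadata


def installed_versions(packages=("auto-sklearn", "scikit-learn", "scipy", "numpy")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Pasting this output alongside the import traceback makes it easier for others to see which combination of versions is in play.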

[Attachments: "Error Installing Autosklearn 0 15 0", "Error import autosklearn classification"]

tron27 commented 1 year ago

Franklin, I just tried your solution below (pure genius):

!pip install git+https://github.com/Frankothe196/auto-sklearn.git@python3.10-added-compatibility

It definitely works! I can run the following lines of code without errors.

import autosklearn
import autosklearn.classification

My goal is to apply AutoML in four lines of code! I realize that some functions may import, but not work. I'm keeping my fingers crossed on this. However, this is a great start. See attachments below.

Thank you very much, Eric

[Attachments: "Auto-sklearn 0 15 0 Installed", "import autosklearn:autosklear classification"]

FranklinBarto commented 1 year ago

Happy to help @tron27! Are you experiencing any issues so far?

tron27 commented 1 year ago

@Frankothe196, great to hear from you! The only issue I'm facing is that currently, unless I'm mistaken, autosklearn doesn't handle multiclass-multioutput. I did a little digging, and this issue was brought up a couple of years ago (see below).

Missing multiclass-multioutput support #292

In #292 (now closed), Matthias mentioned that once scikit-learn provides metrics to evaluate multioutput-multiclass predictions, there will be a way for autosklearn to work with multiclass-multioutput. I checked the scikit-learn website, and it appears that scikit-learn can now work with multiclass-multioutput (see the link below). It would be great if autosklearn provided this capability, because I am interested in predicting longitude and latitude.

Also, when I try to fit my X_train and y_train data, I get an error (see attachment). In the worst case, if autosklearn does not support multiclass-multioutput, I'll just have to try some of the multiclass-multioutput algorithms from the scikit-learn website.

Last but not least, it appears that the current autosklearn release only supports the first version of the classifier, AutoSklearnClassifier. Does it support the second version, AutoSklearn2Classifier? I don't think it does!

https://scikit-learn.org/stable/modules/multiclass.html

[Attachment: "continuous multioutput error"]

Thanks, Eric