hyperopt / hyperopt-sklearn

Hyper-parameter optimization for sklearn
hyperopt.github.io/hyperopt-sklearn

showstopper issue - AttributeError: 'hyperopt_estimator' object has no attribute 'classifier' #168

Open snaraya7 opened 3 years ago

snaraya7 commented 3 years ago

Thank you for hyperopt-sklearn; it is a great package.

Recently, even this basic code fails:

from hpsklearn import HyperoptEstimator, any_classifier, any_preprocessing
from hyperopt import tpe

estim = HyperoptEstimator(classifier=any_classifier('my_clf'),
                          preprocessing=any_preprocessing('my_pre'),
                          algo=tpe.suggest,
                          max_evals=100,
                          trial_timeout=120)

print(estim)

  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\base.py", line 260, in __repr__
    repr_ = pp.pformat(self)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\Lib\pprint.py", line 144, in pformat
    self._format(object, sio, 0, 0, {}, 0)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\Lib\pprint.py", line 161, in _format
    rep = self._repr(object, context, level)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\Lib\pprint.py", line 393, in _repr
    self._depth, level)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\_pprint.py", line 181, in format
    changed_only=self._changed_only)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\_pprint.py", line 425, in _safe_repr
    params = _changed_params(object)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\_pprint.py", line 91, in _changed_params
    params = estimator.get_params(deep=False)
  File "C:\Users\ncshr\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\base.py", line 195, in get_params
    value = getattr(self, key)
AttributeError: 'hyperopt_estimator' object has no attribute 'classifier'

Environment: Windows 10, Python 3.7.1, hpsklearn 0.1.0, networkx 2.5. Please advise.

bjkomer commented 3 years ago

My guess would be you have an older version of hpsklearn. Have you tried installing the one from master? The version I get when I do that is 0.0.3 which is different than yours.

snaraya7 commented 3 years ago

Thank you for your suggestion, but installing from a git clone failed with:

Cloning into 'hyperopt-sklearn'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists. 

I tried both a) cloning master from https://github.com/hyperopt/hyperopt-sklearn and b) pip install hpsklearn==0.0.3, and I still get the same error: AttributeError: 'hyperopt_estimator' object has no attribute 'classifier'. Also, my IDE offers an upgrade to the latest version, 0.1.0; the version I currently have installed is 0.0.3.

mgbckr commented 3 years ago

Hi @snaraya7, for cloning master, maybe try via https rather than ssh (git clone https://github.com/hyperopt/hyperopt-sklearn.git). That worked for me.

snaraya7 commented 3 years ago

@mgbckr Thank you! Unfortunately, I get the same error. I am using Python 3.7.1. I tried both installing the package and referencing it separately in the IDE, but neither works.

mgbckr commented 3 years ago

@snaraya7 Just started playing with it, too, and I actually get the same issue :D

So the issue lies in the __init__ method, which does not follow the sklearn convention of a "dumb" constructor that just copies its parameters to instance attributes of the same name. Specifically, classifier (among others) is not copied, so calling get_params breaks: the corresponding sklearn function looks up an attribute on the instance for each parameter named in the constructor signature.
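To illustrate the failure mode, here is a minimal standard-library sketch. MiniBase is a hypothetical, simplified stand-in for the parameter lookup that the traceback shows in sklearn's get_params (the getattr(self, key) line); it is not sklearn's actual implementation, and BrokenEstimator and FixedEstimator are made-up names rather than hpsklearn classes.

```python
import inspect


class MiniBase:
    """Hypothetical, simplified stand-in for sklearn's parameter lookup."""

    def get_params(self):
        # Read the parameter names from the constructor signature...
        names = [
            name
            for name in inspect.signature(type(self).__init__).parameters
            if name != "self"
        ]
        # ...then expect an attribute of the same name on the instance.
        return {name: getattr(self, name) for name in names}


class BrokenEstimator(MiniBase):
    def __init__(self, classifier=None, max_evals=10):
        # 'classifier' is stored under a different name, so the lookup fails.
        self._clf = classifier
        self.max_evals = max_evals


class FixedEstimator(MiniBase):
    def __init__(self, classifier=None, max_evals=10):
        # "Dumb" constructor: copy every argument to a same-named attribute.
        self.classifier = classifier
        self.max_evals = max_evals


def try_get_params(estimator):
    """Return the parameters, or the AttributeError message if lookup fails."""
    try:
        return estimator.get_params()
    except AttributeError as exc:
        return str(exc)
```

Calling try_get_params on a BrokenEstimator returns an error message of the same shape as the one in this issue, while a FixedEstimator returns its parameters as a dict.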

snaraya7 commented 3 years ago

Yay! Thanks for debugging. Please let me know if you have a fix. I am looking into your explanation.

mgbckr commented 3 years ago

Tried to come up with a solution. Please test if you get a chance.

snaraya7 commented 3 years ago

Yes, but it is not fully resolved. Are you able to fit a best model on some dataset? I am now getting new errors involving components and freeze_support().

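from hpsklearn import HyperoptEstimator, any_sparse_classifier, tfidf
from hyperopt import tpe

# X_train, y_train, X_test, y_test are assumed to be defined elsewhere
estim = HyperoptEstimator(classifier=any_sparse_classifier('clf'),
                          preprocessing=[tfidf('tfidf')],
                          algo=tpe.suggest,
                          trial_timeout=300)

estim.fit(X_train, y_train)

print(estim.score(X_test, y_test))
print(estim.best_model())

mgbckr commented 3 years ago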

Works fine for me it seems. If you can provide a minimal example and an error message, I'd try to have a look if I can find the time.

from hpsklearn import HyperoptEstimator, any_classifier, any_preprocessing
from hyperopt import tpe
import sklearn.datasets

e = HyperoptEstimator(
    classifier=any_classifier('my_clf'),
    preprocessing=any_preprocessing('my_pre'),
    algo=tpe.suggest,
    max_evals=2,
    trial_timeout=5)
# e.get_params()

X, y = sklearn.datasets.make_classification()
e.fit(X, y)
e.predict(X)
print(e.score(X, y))
print(e.best_model())

bjkomer commented 3 years ago

I had some time to look into this a bit more, and it seems the freeze_support error is a Windows-specific issue. You can try the solution here: https://stackoverflow.com/questions/24374288/where-to-put-freeze-support-in-a-python-script/24374798#24374798

Some answers say you may also need to add a freeze_support line explicitly, like this:

from multiprocessing import freeze_support

if __name__ == '__main__':
    freeze_support()
    main()

This looks to be the same issue as #72 and #163
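For reference, a self-contained sketch of that layout using only the standard library. The square worker and the body of main are hypothetical placeholders; in this thread, main would hold the code that builds the HyperoptEstimator and calls fit. On Windows, child processes re-import the main module, so anything that starts processes needs to sit behind the __name__ guard.

```python
from multiprocessing import Pool, freeze_support


def square(x):
    # Worker functions live at module level so child processes can import them.
    return x * x


def main():
    # Hypothetical entry point; replace the body with the estimator code.
    with Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))


if __name__ == "__main__":
    # Only has an effect in a frozen Windows executable; harmless otherwise.
    freeze_support()
    main()
```

The guard is the important part for a plain script; freeze_support() itself matters only when the program is packaged as a frozen executable.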

hjort commented 3 years ago

Same issue on Ubuntu Linux with Python 3.6 on Anaconda:

$ python
Python 3.6.7 |Anaconda custom (32-bit)| (default, Oct 23 2018, 19:27:27)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from hpsklearn import HyperoptEstimator, any_classifier, any_preprocessing
WARN: OMP_NUM_THREADS=None =>
... If you are using openblas if you are using openblas set OMP_NUM_THREADS=1 or risk subprocess calls hanging indefinitely
>>> from hyperopt import tpe
>>> estim = HyperoptEstimator(classifier=any_classifier('my_clf'),
...                           preprocessing=any_preprocessing('my_pre'),
...                           algo=tpe.suggest,
...                           max_evals=100,
...                           trial_timeout=120)
>>> print(estim)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 260, in __repr__
    repr_ = pp.pformat(self)
  File "/opt/anaconda3/lib/python3.6/pprint.py", line 144, in pformat
    self._format(object, sio, 0, 0, {}, 0)
  File "/opt/anaconda3/lib/python3.6/pprint.py", line 161, in _format
    rep = self._repr(object, context, level)
  File "/opt/anaconda3/lib/python3.6/pprint.py", line 393, in _repr
    self._depth, level)
  File "/opt/anaconda3/lib/python3.6/site-packages/sklearn/utils/_pprint.py", line 181, in format
    changed_only=self._changed_only)
  File "/opt/anaconda3/lib/python3.6/site-packages/sklearn/utils/_pprint.py", line 425, in _safe_repr
    params = _changed_params(object)
  File "/opt/anaconda3/lib/python3.6/site-packages/sklearn/utils/_pprint.py", line 91, in _changed_params
    params = estimator.get_params(deep=False)
  File "/opt/anaconda3/lib/python3.6/site-packages/sklearn/base.py", line 195, in get_params
    value = getattr(self, key)
AttributeError: 'hyperopt_estimator' object has no attribute 'classifier'

I've installed it through: pip install git+https://github.com/hyperopt/hyperopt-sklearn

Any thoughts on how to solve this?

mgbckr commented 3 years ago

Check out my pull request #169, mentioned above. It works for me, at least. Maybe you can give it a try.