HDI-Project / ATM

Auto Tune Models - A multi-tenant, multi-data system for automated machine learning (model selection and tuning).
https://hdi-project.github.io/ATM/
MIT License
525 stars 141 forks

Quickstart failing related to BTB #119

Closed RogerTangos closed 5 years ago

RogerTangos commented 5 years ago

I've run through the quickstart and am having trouble tuning models. It seems that there is an issue with unpickling hyperpartitions. @micahjsmith says that this may have to do with https://github.com/HDI-Project/BTB/issues/96

$ python scripts/enter_data.py
...
$ python scripts/worker.py

Computing on datarun 1
Selector: <class 'btb.selection.uniform.Uniform'>
Tuner: <class 'btb.tuning.uniform.Uniform'>
Error choosing hyperparameters: datarun=<ID = 1, dataset ID = 1, strategy = uniform__uniform, budget = classifier (100), status: running>
Traceback (most recent call last):
  File "/Users/arcarter/.virtualenvs/test/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/worker.py", line 379, in run_classifier
    params = self.tune_hyperparameters(hyperpartition)
  File "/Users/arcarter/.virtualenvs/test/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/worker.py", line 160, in tune_hyperparameters
    tunables = hyperpartition.tunables
  File "/Users/arcarter/.virtualenvs/test/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/database.py", line 214, in tunables
    return base_64_to_object(self.tunable_hyperparameters_64)
  File "/Users/arcarter/.virtualenvs/test/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/utilities.py", line 86, in base_64_to_object
    return pickle.loads(decoded)
  File "/Users/arcarter/.virtualenvs/test/lib/python3.6/site-packages/btb/hyper_parameter.py", line 51, in __new__
    raise ValueError('Invalid param type {}'.format(param_type))
ValueError: Invalid param type None

My environment:

Pip freeze:

baytune==0.2.2
boto==2.49.0
certifi==2018.10.15
chardet==3.0.4
future==0.17.1
idna==2.7
joblib==0.13.0
mysqlclient==1.3.13
numpy==1.15.4
pandas==0.23.4
python-dateutil==2.7.5
pytz==2018.7
PyYAML==3.13
requests==2.20.1
scikit-learn==0.20.0
scipy==1.1.0
six==1.11.0
sklearn-pandas==1.7.0
SQLAlchemy==1.2.14
urllib3==1.24.1

RogerTangos commented 5 years ago

Interestingly, this also happens when I go back to my previously working branch and downgrade to baytune==0.1.2.

micahjsmith commented 5 years ago

We patched btb to baytune==0.2.3, which I thought would have solved the problem. Now I am getting:

Something went wrong. Sleeping 1 seconds.
Computing on datarun 1
Selector: <class 'btb.selection.uniform.Uniform'>
Tuner: <class 'btb.tuning.uniform.Uniform'>
Error choosing hyperparameters: datarun=<ID = 1, dataset ID = 1, strategy = uniform__uniform, budget = classifier (100), status: running>
Traceback (most recent call last):
  File "/Users/micahsmith/workspace/atm/atm/worker.py", line 379, in run_classifier
    params = self.tune_hyperparameters(hyperpartition)
  File "/Users/micahsmith/workspace/atm/atm/worker.py", line 160, in tune_hyperparameters
    tunables = hyperpartition.tunables                                   
  File "/Users/micahsmith/workspace/atm/atm/database.py", line 214, in tunables
    return base_64_to_object(self.tunable_hyperparameters_64)
  File "/Users/micahsmith/workspace/atm/atm/utilities.py", line 86, in base_64_to_object
    return pickle.loads(decoded)           
AttributeError: 'NoneType' object has no attribute '__dict__'

RogerTangos commented 5 years ago

After some digging, I've realized that what's being set in database.py @tunables.setter isn't compatible with the new BTB hyper_parameter.

This is being set from atm/enter_data.py and contains baytune objects: [('C', <btb.hyper_parameter.FloatExpHyperParameter object at 0x1117aec50>), ('tol', <btb.hyper_parameter.FloatExpHyperParameter object at 0x1117aebe0>)]. The data is loaded via atm/method.py from the methods/*.json files.

Do the methods/*.json files need to be updated for btb 2.0+?


Sidenote: When debugging, it's very helpful to delete atm.db. Since the hyperpartitions are encoded as base64 and written into the database, it's possible to have wrong versions in the DB.
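
To check what a stored value refers to without deleting the database first, a base64-encoded pickle can be scanned without loading it. The sketch below is only an illustration: `referenced_globals` is a hypothetical helper, not part of ATM or BTB, and the payload uses a standard-library class as a stand-in for the BTB hyperparameter objects. It assumes the value was written with pickle protocol 3 or lower (the default on Python 3.6), where class references use the GLOBAL opcode.

```python
import base64
import pickle
import pickletools
from fractions import Fraction


def referenced_globals(blob_64):
    """List the 'module name' pairs that a base64-encoded pickle refers to.

    The pickle is only scanned, never loaded, so this works even when
    loading it would raise. Only the GLOBAL opcode (pickle protocols 0-3)
    is handled; protocol 4 and later encode class references differently.
    """
    data = base64.b64decode(blob_64)
    return sorted({
        arg for opcode, arg, _pos in pickletools.genops(data)
        if opcode.name == 'GLOBAL'
    })


# Stand-in payload: a list of (name, object) pairs shaped like the tunables
# shown above, but built from a standard-library class instead of BTB.
payload = [('C', Fraction(1, 2)), ('tol', Fraction(1, 4))]
blob_64 = base64.b64encode(pickle.dumps(payload, protocol=2)).decode('ascii')

# Prints the module and class names the stored value depends on.
print(referenced_globals(blob_64))
```

If the names printed for a real row point at a module or class layout that the installed BTB no longer has, the row is stale and the database needs to be rebuilt.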

micahjsmith commented 5 years ago

After some digging, I've realized that what's being set in database.py @tunables.setter isn't compatible with the new BTB hyper_parameter

I don't understand how it is incompatible. (I don't think there are any compatibility issues between btb v1 and btb v2 here. Of course, if there are serialized btb v1 hyperparameters in the database, those need to be cleared; they cannot be used with a version of atm that uses btb v2.)

Sidenote: When debugging, it's very helpful to delete atm.db. Since the hyperpartitions are encoded as base64 and written into the database, it's possible to have wrong versions in the DB.

Agreed, I have been using make clean before running the scripts.

RogerTangos commented 5 years ago

What's being set by the @tunables.setter can't be decoded by the tunables property. It's a bit hard to see what's going on there (since I'm not familiar with BTB, and since things are base64 encoded).

I'm not sure if that helps (sorry if it doesn't!). For now, I'm able to get things working using BTB 0.1.2.
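
One way a value can be encoded without error and still fail to decode is a class whose `__new__` requires arguments. When loading, pickle calls `cls.__new__(cls)` with no further arguments unless the class provides them through `__getnewargs__`. The sketch below reproduces an error of the same shape as the first traceback using a hypothetical `Param` class. It is not BTB's code, and whether this is exactly what happened in BTB is an assumption. `to_base64` and `from_base64` are illustrative helpers, not ATM's functions.

```python
import base64
import pickle


class Param:
    """Hypothetical hyperparameter class whose __new__ validates its input."""

    def __new__(cls, param_type=None, param_range=None):
        if param_type is None:
            raise ValueError('Invalid param type {}'.format(param_type))
        return super().__new__(cls)

    def __init__(self, param_type=None, param_range=None):
        self.param_type = param_type
        self.param_range = param_range


class PicklableParam(Param):
    """Same class, but it tells pickle which arguments to pass to __new__."""

    def __getnewargs__(self):
        return (self.param_type, self.param_range)


def to_base64(obj):
    # Pickle the object and wrap the bytes as ASCII text for storage.
    return base64.b64encode(pickle.dumps(obj)).decode('ascii')


def from_base64(text):
    # Reverse of to_base64: decode the text and unpickle the bytes.
    return pickle.loads(base64.b64decode(text))


# Encoding succeeds for both classes.
broken_blob = to_base64(Param('float', (0, 1)))
working_blob = to_base64(PicklableParam('float', (0, 1)))

# Decoding fails for Param because __new__ is called without param_type.
try:
    from_base64(broken_blob)
except ValueError as exc:
    print('decode failed:', exc)

# Decoding works once __getnewargs__ supplies the arguments.
restored = from_base64(working_blob)
print(restored.param_type, restored.param_range)
```

Note that `__init__` is not called during unpickling; the instance attributes are restored from the pickled state after `__new__` returns.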

pvk-developer commented 5 years ago

Hi! I'm trying to follow the QuickStart and I'm having the same issue as @micahjsmith in https://github.com/HDI-Project/ATM/issues/119#issuecomment-438529629

The command I'm trying to run is python scripts/worker.py. The traceback I'm getting is:

$ python scripts/worker.py 
Computing on datarun 1
Selector: <class 'btb.selection.uniform.Uniform'>
Tuner: <class 'btb.tuning.uniform.Uniform'>
Error choosing hyperparameters: datarun=<ID = 1, dataset ID = 1, strategy = uniform__uniform, budget = classifier (100), status: running>
Traceback (most recent call last):
  File "/virtualenvs/ATM/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/worker.py", line 379, in run_classifier
    params = self.tune_hyperparameters(hyperpartition)
  File "/virtualenvs/ATM/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/worker.py", line 160, in tune_hyperparameters
    tunables = hyperpartition.tunables
  File "/virtualenvs/ATM/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/database.py", line 214, in tunables
    return base_64_to_object(self.tunable_hyperparameters_64)
  File "/virtualenvs/ATM/lib/python3.6/site-packages/atm-0.1.0-py3.6.egg/atm/utilities.py", line 86, in base_64_to_object
    return pickle.loads(decoded)
AttributeError: 'NoneType' object has no attribute '__dict__'

csala commented 5 years ago

This has already been fixed in the latest release, v0.1.1, so I'm closing this issue.