sherpa-ai / sherpa

Hyperparameter optimization that enables researchers to experiment, visualize, and scale quickly.
http://parameter-sherpa.readthedocs.io/
GNU General Public License v3.0

trying to implement sherpa with mongodb #56

Closed shreyagu closed 4 years ago

shreyagu commented 5 years ago

Aim

I'm trying to use Sherpa for parallel Bayesian optimization, but simply running sherpa/examples/parallel_example/simple.py throws the following errors:

Code

"""
SHERPA is a Python library for hyperparameter tuning of machine learning models.
Copyright (C) 2018  Lars Hertel, Peter Sadowski, and Julian Collado.

This file is part of SHERPA.

SHERPA is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

SHERPA is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with SHERPA.  If not, see <http://www.gnu.org/licenses/>.
"""
from __future__ import print_function
import tempfile
import os
import sherpa
import sherpa.schedulers
import argparse
import socket
import sherpa.algorithms.bayesian_optimization as bayesian_optimization

parser = argparse.ArgumentParser()
parser.add_argument('--env', help='Your environment path.',
                    default='/home/lhertel/profiles/python3env.profile', type=str)
FLAGS = parser.parse_args()
# figuring out host and queue
host = socket.gethostname()
sge_q = 'arcus.q' if (host.startswith('arcus-1') or host.startswith('arcus-2') or host.startswith('arcus-3') or host.startswith('arcus-4')) else 'arcus-ubuntu.q'

tempdir = tempfile.mkdtemp(dir=".")

parameters = [sherpa.Choice(name="param_a",
                            range=[1, 2, 3]),
              sherpa.Continuous(name="param_b",
                                range=[0, 1])]

algorithm = sherpa.algorithms.RandomSearch(max_num_trials=10)
# stopping_rule = sherpa.algorithms.MedianStoppingRule(min_iterations=2,
#                                           min_trials=3)
# algorithm = bayesian_optimization.GPyOpt(max_concurrent=4,
#                                          model_type='GP',
#                                          acquisition_type='EI',
#                                          max_num_trials=100)

# scheduler = sherpa.schedulers.SGEScheduler(submit_options="-N example -P arcus.p -q {} -l hostname='{}'".format(sge_q, host), environment=FLAGS.env, output_dir=tempdir)

scheduler = sherpa.schedulers.LocalScheduler()

### The *training script*
testscript = """import sherpa
import time

client = sherpa.Client(db_dir = "mongodb://localhost:27017/sherpa_sample/jobs/", port = 27017)
trial = client.get_trial()

# Simulate model training
num_iterations = 10
for i in range(num_iterations):
    pseudo_objective = trial.parameters['param_a'] / float(i + 1) * trial.parameters['param_b']
    time.sleep(1)
    client.send_metrics(trial=trial, iteration=i+1,
                        objective=pseudo_objective)
    # print("Trial {} Iteration {}.".format(trial.id, i+1))
# print("Trial {} finished.".format(trial.id))
"""
dbport = 27017
filename = os.path.join(tempdir, "test.py")
with open(filename, 'w') as f:
    f.write(testscript)

results = sherpa.optimize(parameters=parameters,
                          algorithm=algorithm,
                          lower_is_better=True,
                          filename=filename,
                          output_dir=tempdir,
                          scheduler=scheduler,
                          max_concurrent=4,
                          verbose=1)

print(results)

Error Logs

INFO:sherpa.core:
 * Serving Flask app "sherpa.app.app" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 1:
    param_a        =                              2
    param_b        =             0.8522431562647138
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 2:
    param_a        =                              3
    param_b        =            0.29715833725982344
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 3:
    param_a        =                              2
    param_b        =           0.020480942550347048
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 4:
    param_a        =                              3
    param_b        =              0.596307236230985
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 5:
    param_a        =                              1
    param_b        =             0.4686961645249037
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 6:
    param_a        =                              2
    param_b        =            0.35682090172629666
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 7:
    param_a        =                              2
    param_b        =             0.6391827285407945
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 8:
    param_a        =                              3
    param_b        =             0.4170850070420954
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 9:
    param_a        =                              1
    param_b        =              0.250596506613348
-------------------------------------------------------

INFO:sherpa.core:
-------------------------------------------------------
Submitting Trial 10:
    param_a        =                              1
    param_b        =             0.4596871330102428
-------------------------------------------------------

INFO:sherpa.core:Optimization Algorithm finished.
INFO:sherpa.core:Optimization Algorithm finished.
Closing MongoDB!
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Objective'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "simple.py", line 87, in <module>
    verbose=1)
  File "/usr/local/lib/python3.6/dist-packages/sherpa/core.py", line 636, in optimize
    return study.get_best_result()
  File "/usr/local/lib/python3.6/dist-packages/sherpa/core.py", line 254, in get_best_result
    self.lower_is_better)
  File "/usr/local/lib/python3.6/dist-packages/sherpa/algorithms/core.py", line 69, in get_best_result
    if lower_is_better
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 1494, in __getitem__
    return self._getitem_tuple(key)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 868, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 988, in _getitem_lowerdim
    section = self._getitem_axis(key, axis=i)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 1913, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 141, in _get_label
    return self.obj._xs(label, axis=axis)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 3576, in xs
    return self[key]
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Objective'
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 23, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Objective'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "simple.py", line 87, in <module>
    verbose=1)
  File "/usr/local/lib/python3.6/dist-packages/sherpa/core.py", line 636, in optimize
    return study.get_best_result()
  File "/usr/local/lib/python3.6/dist-packages/sherpa/core.py", line 254, in get_best_result
    self.lower_is_better)
  File "/usr/local/lib/python3.6/dist-packages/sherpa/algorithms/core.py", line 69, in get_best_result
    if lower_is_better
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 1494, in __getitem__
    return self._getitem_tuple(key)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 868, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 988, in _getitem_lowerdim
    section = self._getitem_axis(key, axis=i)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 1913, in _getitem_axis
    return self._get_label(key, axis=axis)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexing.py", line 141, in _get_label
    return self.obj._xs(label, axis=axis)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py", line 3576, in xs
    return self[key]
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Objective'

LarsHH commented 4 years ago

Closing this after direct conversation with author of the issue.

ggantos commented 4 years ago

Hello! I'm getting this very same error running simple.py. Any chance you can post what resolved the issue? Thanks!

LarsHH commented 4 years ago

Hi @ggantos! I just pulled from master and ran this and it worked. Do you mind doing the following:

1) Paste the output from running the script. Is it exactly as above?
2) Check whether a tmp* folder was created in the folder where simple.py is. If yes, could you check whether it contains a results.csv and whether that file is non-empty?
3) Check which pandas version you have.

Best, Lars
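
The checks in points 2 and 3 can be scripted. The following is a minimal sketch, not part of Sherpa: the helper names find_results_files and installed_version are made up for illustration, and it assumes the script is run from the folder that contains simple.py, where tempfile.mkdtemp(dir=".") creates the tmp* folder.

```python
import glob
import os
from importlib import metadata  # standard library, Python 3.8+


def find_results_files(base_dir="."):
    """Return (path, size_in_bytes) for every tmp*/results.csv under base_dir."""
    pattern = os.path.join(base_dir, "tmp*", "results.csv")
    return [(path, os.path.getsize(path)) for path in sorted(glob.glob(pattern))]


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    hits = find_results_files(".")
    if not hits:
        print("No tmp*/results.csv found in the current folder.")
    for path, size in hits:
        note = " (empty)" if size == 0 else ""
        print("{}: {} bytes{}".format(path, size, note))
    print("pandas version:", installed_version("pandas"))
```

A missing or empty results.csv would suggest that no trial ever reported metrics back, in which case the trial output is the next place to look.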

ggantos commented 4 years ago

Hello! Thank you for the reply. I was trying to fix things myself and figured out that I needed to conda deactivate the base environment in my batch script after source ~/.bashrc, and then conda activate my sherpa environment. Thanks for your reply and help!
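
The fix above suggests the batch job was running under a different conda environment than intended. One way to confirm this, sketched here under that assumption, is to log which interpreter and which sherpa installation a job actually picks up. The function describe_environment is a hypothetical helper, not a Sherpa API; CONDA_DEFAULT_ENV is the variable conda sets on activation.

```python
import importlib.util
import os
import sys


def describe_environment(module_name="sherpa"):
    """Collect where Python and the given module would be loaded from."""
    # find_spec locates a top-level module without importing it.
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,
        "prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # None means the module is not installed in this environment.
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("{:>14}: {}".format(key, value))
```

Running this once from the interactive shell and once from inside the batch script makes a mismatch visible: if conda_env or module_origin differ between the two, the job is not using the sherpa environment.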

bluevex commented 4 years ago

@ggantos @LarsHH Can you explain this in more detail, e.g. what is different in your sherpa environment? I am facing the same issue: KeyError: 'Objective'. No data is displayed on the dashboard. Job output for trial 1:


Traceback (most recent call last):
  File "./tmpmiwdolg5/test.py", line 5, in <module>
    trial = client.get_trial()

  File "/home/user/miniconda3/lib/python3.7/site-packages/sherpa/database.py", line 222, in get_trial
    t = next(g)

  File "/home/user/miniconda3/lib/python3.7/site-packages/sherpa/database.py", line 221, in <genexpr>
    g = (entry for entry in self.db.trials.find({'trial_id': trial_id}))

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/cursor.py", line 1207, in next
    if len(self.__data) or self._refresh():

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/cursor.py", line 1100, in _refresh
    self.__session = self.__collection.database.client._ensure_session()

  File "/home/vex/miniconda3/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1816, in _ensure_session
    return self.__start_session(True, causal_consistency=False)

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1766, in __start_session
    server_session = self._get_server_session()

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1802, in _get_server_session
    return self._topology.get_server_session()

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/topology.py", line 488, in get_server_session
    None)

  File "/home/user/miniconda3/lib/python3.7/site-packages/pymongo/topology.py", line 217, in _select_servers_loop
    (self._error_message(selector), timeout, self.description))

pymongo.errors.ServerSelectionTimeoutError: hostname:27001: [Errno 111] Connection refused, Timeout: 30s, Topology Description: <TopologyDescription id: 5f3c4212cde0a5c7878ddbe9, topology_type: Single, servers: [<ServerDescription ('openmind7', 27001) server_type: Unknown, rtt: None, error=AutoReconnect('hostname:27001: [Errno 111] Connection refused')>]>

The non-parallel simple.py works fine, including the dashboard. I'm using MongoDB v4.0.3 because that's what's installed on the cluster I'm using.
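
The ServerSelectionTimeoutError above reports "Connection refused" on port 27001, which means nothing was accepting connections at that address when the trial ran. A quick way to test this independently of pymongo is a plain TCP probe. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper, and the port number is simply the one shown in the traceback.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    host = socket.gethostname()
    port = 27001  # taken from the traceback; adjust to your setup
    state = "open" if port_is_open(host, port, timeout=1.0) else "closed or unreachable"
    print("{}:{} is {}".format(host, port, state))
```

The probe only says something useful if it is run on the node where the trial executes and while the optimization is still running. A closed port at that point would indicate that the database was never started or has already shut down.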

ggantos commented 4 years ago

Hi @bluevex - rather than building an environment from what I was using in my other repos, I installed via Sherpa's setup.py and only modified module versions to get the other libraries in my workflow to function. I also deactivated conda's base environment before activating my sherpa environment. I'm honestly not sure about your error, but these two changes worked for me. I hope that helps!

LarsHH commented 4 years ago

@bluevex For now I'd recommend cloning from GitHub and installing sherpa with pip install -e . from the root directory of the cloned repository. That's because the PyPI release is one version behind the GitHub version due to deployment issues.