hyperopt / hyperopt

Distributed Asynchronous Hyperparameter Optimization in Python
http://hyperopt.github.io/hyperopt

How to extract n-th best hyper-parameters from trials object? #620

Closed consti-91 closed 8 months ago

consti-91 commented 4 years ago

I wonder how I can extract the n-th best (e.g. second best or third best) model or hyper-parameters, respectively, from the trials database object?

I started with the following function that I've found here:

from hyperopt import STATUS_OK

def nthBestModelFromTrials(trials,     # trials database object
                           rank = 2):  # zero-based rank, in this case the third best model

    # filter out trials with status non-ok
    valid_trial_list = [trial for trial in trials
                            if STATUS_OK == trial['result']['status']]

    # get the losses of the remaining trials
    losses = [float(trial['result']['loss']) for trial in valid_trial_list]

    # order the trial indices by loss; unlike losses.index(sorted(losses)[rank]),
    # this still picks distinct trials when two of them share the same loss
    order = sorted(range(len(losses)), key=losses.__getitem__)
    index_having_nth_minimum_loss = order[rank]

    # get results data from the n-th best model object
    best_trial_obj = valid_trial_list[index_having_nth_minimum_loss]
    return best_trial_obj

The best_trial_obj gives me the following dictionary:


 {
 'state': 2,
 'tid': 4,
 'spec': None,
 'result': {'loss': 2339973.0105,
  'status': 'ok',
  'validation_technique_used': 'validation set 30% of newest data'},
 'misc': {'tid': 4,
  'cmd': ('domain_attachment', 'FMinIter_Domain'),
  'workdir': None,
  'idxs': {'D': [1],
   'P': [0],
   'Q': [0],
   'changepoint_prior_scale': [],
   'classifier': [4],
   'd': [1],
   'external_regressors_prophet': [],
   'external_regressors_sarimax': [4],
   'growth_type': [],
   'p': [3],
   'q': [2],
   'regressor_prior_scale': [],
   's': [12],
   'yearly_fourier_order': [],
   'yearly_seasonality_prior_scale': []},
  'vals': {'D': [1],
   'P': [1],
   'Q': [1],
   'changepoint_prior_scale': [],
   'classifier': [1],
   'd': [1],
   'external_regressors_prophet': [],
   'external_regressors_sarimax': [0],
   'growth_type': [],
   'p': [1],
   'q': [2],
   'regressor_prior_scale': [],
   's': [0],
   'yearly_fourier_order': [],
   'yearly_seasonality_prior_scale': []}},
 'exp_key': None,
 'owner': None,
 'version': 0,
 'book_time': datetime.datetime(2020, 2, 10, 16, 25, 35, 277000),
 'refresh_time': datetime.datetime(2020, 2, 10, 16, 25, 35, 499000)
 }

So, knowing for example the "tid" of the third best model, I wonder how I can get its hyper-parameters in the format that the fmin function returns for the best model, which looks like this:

{'D': 1,
 'P': 0,
 'Q': 0,
 'classifier': 1,
 'd': 1,
 'external_regressors_sarimax': 0,
 'p': 3,
 'q': 2,
 's': 12}

Also, I'm confused about "idxs" and "vals" - what's the difference here? Thank you so much in advance!

skylogic004 commented 3 years ago

"idxs" can be ignored; the docstring for Trials defines what it is:

The idxs dictionary is technically redundant -- it is the same as vals but it maps hyperparameter names to either [] or [<tid>].

As for "vals", here's what it says:

The vals dictionary is a sub-sub-dictionary mapping each hyperparameter to either [] (if the hyperparameter is inactive in this trial), or [<val>] (if the hyperparameter is active).

Now, to get back values in the same form as the hyperparameter space, we can do what the fmin function does.

First, vals is a bunch of 1-element arrays which need to be unpacked:

# this code is a slightly modified version of hyperopt.base.Trials.argmin
def unpack_values(trial):
    vals = trial["misc"]["vals"]
    # unpack the one-element lists to values
    # and skip over the 0-element lists
    rval = {}
    for k, v in list(vals.items()):
        if v:
            rval[k] = v[0]
    return rval

vals = unpack_values(best_trial_obj)
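
To make the unpacking step concrete, here is a small sketch that applies it to the "vals" dictionary posted in the question. The trial below is a trimmed copy of that output (only "tid" and "misc"/"vals" are kept), and unpack_values is restated as a comprehension so the snippet runs on its own.

```python
# Restated from the comment above so the snippet is self-contained.
def unpack_values(trial):
    vals = trial["misc"]["vals"]
    # keep only active hyperparameters and unwrap their one-element lists
    return {name: v[0] for name, v in vals.items() if v}

# Trimmed copy of the trial document from the question.
trial = {
    "tid": 4,
    "misc": {
        "vals": {
            "D": [1],
            "P": [1],
            "Q": [1],
            "changepoint_prior_scale": [],
            "classifier": [1],
            "d": [1],
            "external_regressors_prophet": [],
            "external_regressors_sarimax": [0],
            "growth_type": [],
            "p": [1],
            "q": [2],
            "regressor_prior_scale": [],
            "s": [0],
            "yearly_fourier_order": [],
            "yearly_seasonality_prior_scale": [],
        }
    },
}

unpacked = unpack_values(trial)
print(unpacked)
# {'D': 1, 'P': 1, 'Q': 1, 'classifier': 1, 'd': 1,
#  'external_regressors_sarimax': 0, 'p': 1, 'q': 2, 's': 0}
```

Inactive hyperparameters such as growth_type (empty list) are dropped, which matches the flat shape that fmin returns.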

Second, we evaluate these values against the hyperparameter space, which turns the stored hp.choice indices back into the actual options. space is the object you passed to the fmin function.

from hyperopt.fmin import space_eval
best_values = space_eval(space, vals)

I believe best_values is what you are looking for.
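
Putting the pieces together, the following is a minimal sketch of one way to go from a trials object to the n-th best parameters. The helper names nth_best_trial and nth_best_params are made up for illustration and are not part of hyperopt, the demo trial documents are hand-written, and STATUS_OK is restated locally as the string "ok" so that the snippet runs without hyperopt installed. space_eval is only imported when a space is actually passed.

```python
STATUS_OK = "ok"  # assumed to match hyperopt.STATUS_OK; restated to avoid the import


def nth_best_trial(trials, rank=0):
    """Return the trial document with the (rank + 1)-th lowest loss (rank is zero-based)."""
    ok_trials = [t for t in trials if t["result"].get("status") == STATUS_OK]
    # sorted() is stable, so trials with equal losses keep their original order
    ranked = sorted(ok_trials, key=lambda t: float(t["result"]["loss"]))
    if not 0 <= rank < len(ranked):
        raise IndexError(f"rank {rank} out of range for {len(ranked)} ok trials")
    return ranked[rank]


def nth_best_params(trials, rank=0, space=None):
    """Unpack the n-th best trial; resolve it against `space` if one is given."""
    trial = nth_best_trial(trials, rank)
    point = {name: v[0] for name, v in trial["misc"]["vals"].items() if v}
    if space is None:
        return point  # same flat shape as the dict returned by fmin
    from hyperopt import space_eval  # only needed when a space is supplied
    return space_eval(space, point)


# Hand-written stand-ins for trial documents (hypothetical data).
demo_trials = [
    {"tid": 0, "result": {"loss": 3.0, "status": "ok"}, "misc": {"vals": {"p": [2], "q": []}}},
    {"tid": 1, "result": {"status": "fail"}, "misc": {"vals": {"p": [9], "q": [9]}}},
    {"tid": 2, "result": {"loss": 1.0, "status": "ok"}, "misc": {"vals": {"p": [0], "q": [1]}}},
    {"tid": 3, "result": {"loss": 2.0, "status": "ok"}, "misc": {"vals": {"p": [1], "q": []}}},
]

second_best = nth_best_params(demo_trials, rank=1)
print(second_best)  # {'p': 1}
```

With a real run, something like nth_best_params(trials, rank=2, space=space) should give the third best configuration in the form of the search space, since iterating over a Trials object yields the same kind of trial documents as shown in the question.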

ArslanKAS commented 1 year ago

You're awesome.

github-actions[bot] commented 9 months ago

This issue has been marked as stale because it has been open 120 days with no activity. Remove the stale label or comment or this will be closed in 30 days.