facebook / Ax

Adaptive Experimentation Platform
https://ax.dev
MIT License

There's a limit of 1112 trials on single-objective experiments, 556 on two-objective #291

Closed mrcslws closed 4 years ago

mrcslws commented 4 years ago

I'm hitting a hard limit on the number of trials in an experiment. With a single-metric objective this limit is 1112 trials; with a two-metric MultiObjective, the limit is 556. In all these cases, I get this error:

UnsupportedError:  SobolQMCSampler only supports dimensions `q * o <= 1111`.

This MultiObjective trend will continue: 3-objective models will have a limit of 1112/3 trials, and so on.

There are a few separate questions here:

  1. (Am I doing something silly?)
  2. Is the limit of 1112 trials known? Is it By Design?
  3. Is there a good reason that the multi-objective case has a lower limit? In hyperparameter search you generally need more trials for multi-objective experiments, not fewer.

Single-objective demo

from ax.service.ax_client import AxClient

PARAMETERS = [
    {"name": "x", "type": "range", "bounds": [12.2, 602.2], "value_type": "float"},
]

def example_f(x):
    # Distance from a multiple of 33
    mod33 = x - (x // 33) * 33
    if mod33 > 33 // 2:
        return 33 - mod33
    else:
        return mod33

NUM_TRIALS = 1200
ax_client = AxClient()
ax_client.create_experiment(
    parameters=PARAMETERS,
    choose_generation_strategy_kwargs=dict(
        # Random trials aren't necessary, but they're faster for demo purposes.
        num_initialization_trials=1112,
    ),
    minimize=True,
)

for i in range(NUM_TRIALS):
    parameters, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(trial_index=trial_index,
                             raw_data=example_f(parameters["x"]))
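For reference, `example_f` is a triangle wave that measures (roughly) the distance to the nearest multiple of 33, so the minima sit at multiples of 33. A couple of spot checks outside of Ax:

```python
def example_f(x):
    # Distance from a multiple of 33 (same function as in the demo above)
    mod33 = x - (x // 33) * 33
    if mod33 > 33 // 2:
        return 33 - mod33
    else:
        return mod33

print(example_f(66.0))  # 0.0 -- exactly on a multiple of 33
print(example_f(70.0))  # 4.0 -- just above the nearest multiple
print(example_f(60.0))  # 6.0 -- just below the nearest multiple
```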

Result

The Botorch model doesn't work after there are more than 1111 results.

  File "/site-packages/ax/modelbridge/generation_strategy.py", line 300, in gen
    keywords=get_function_argument_names(model.gen),
  File "/site-packages/ax/modelbridge/base.py", line 616, in gen
    model_gen_options=model_gen_options,
  File "/site-packages/ax/modelbridge/array.py", line 212, in _gen
    target_fidelities=target_fidelities,
  File "/site-packages/ax/modelbridge/torch.py", line 214, in _model_gen
    target_fidelities=target_fidelities,
  File "/site-packages/ax/models/torch/botorch.py", line 363, in gen
    **acf_options,
  File "/site-packages/ax/models/torch/botorch_defaults.py", line 211, in get_NEI
    seed=torch.randint(1, 10000, (1,)).item(),
  File "/site-packages/botorch/acquisition/utils.py", line 96, in get_acquisition_function
    prune_baseline=kwargs.get("prune_baseline", False),
  File "/site-packages/botorch/acquisition/monte_carlo.py", line 221, in __init__
    model=model, X=X_baseline, objective=objective
  File "/site-packages/botorch/acquisition/utils.py", line 219, in prune_inferior_points
    samples = sampler(posterior)
  File "/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/site-packages/botorch/sampling/samplers.py", line 57, in forward
    self._construct_base_samples(posterior=posterior, shape=base_sample_shape)
  File "/site-packages/botorch/sampling/samplers.py", line 234, in _construct_base_samples
    "SobolQMCSampler only supports dimensions "
botorch.exceptions.errors.UnsupportedError: SobolQMCSampler only supports dimensions `q * o <= 1111`. Requested: 1112

I tried setting prune_baseline=False in "acquisition_function_kwargs", but the error still eventually occurs on another code path, so I haven't found a workaround.

Balandat commented 4 years ago

Hi, thanks for the great repro.

(Am I doing something silly?)

No.

Is the limit of 1112 trials known? Is it By Design?

It's a limitation of the quasi-random Sobol generator in PyTorch that we use under the hood. It's not by design; it's an upstream limitation, where the sampler only supports dimensions up to 1111. While it's possible to change that, it would be a lot of work and so far hasn't been a priority.

Taking a step back, the sampler is used for drawing qMC samples from a high-dimensional posterior distribution of a Gaussian Process model, which is an operation performed in the qNoisyExpectedImprovement acquisition function. The prune_baseline=True that you mentioned is designed to help speed things up by only considering those trials that are or could be relevant to optimize further. Looking at the code, it seems that we don't handle the case where the input to that function is a large number of trials. I can put up a PR in botorch to fix this.
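For intuition, the pruning idea (this is a toy sketch, not botorch's actual `prune_inferior_points` implementation) is: draw joint posterior samples of the objective over all baseline points, then keep only the points that are best in at least one draw. A minimal NumPy version with made-up sample data:

```python
import numpy as np

def prune_inferior_points_sketch(samples):
    """Toy version of baseline pruning.

    samples: (num_samples, n) array of joint posterior draws of the
    objective at n baseline points (higher is better here).
    Returns indices of points that win in at least one draw; the rest
    have (empirically) zero chance of being the best point.
    """
    best_per_draw = samples.argmax(axis=1)
    return np.unique(best_per_draw)

# Three posterior draws over four baseline points: point 3 never wins,
# so it gets pruned.
samples = np.array([
    [0.1, 0.9, 0.3, 0.2],
    [0.8, 0.7, 0.2, 0.1],
    [0.2, 0.3, 0.9, 0.0],
])
print(prune_inferior_points_sketch(samples))  # [0 1 2]
```

The key point for this issue is that drawing those joint samples requires a base-sample dimension proportional to the number of baseline points, which is where the Sobol limit bites.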

Is there a good reason that the multi-objective case has a lower limit? In hyperparameter search you generally need more trials for multi-objective experiments, not fewer.

What's important here is the dimensionality of the distribution from which you sample. Since this distribution is in the outcome space, its dimension is q * m, where q is the number of points considered jointly and m is the number of outcomes. So for the sampler to work you need q * m <= 1111.

I tried setting prune_baseline=False in "acquisition_function_kwargs", but the error still eventually occurs on another code path, so I haven't found a workaround.

Interesting, this shouldn't happen; I'll take a look there as well. In any case, you should definitely not need to set prune_baseline to False, as that will slow things down significantly.