facebook / Ax

Adaptive Experimentation Platform
https://ax.dev
MIT License

How to include extended features (e.g. state variables like temperature or pressure) in a restricted search space: "Fixed parameter ... with out of design value" #905

Closed sgbaird closed 2 years ago

sgbaird commented 2 years ago

The model requires certain additional features in order to obtain a good fit, but these parameters are neither fixed nor necessarily something that makes sense to optimize over (especially in the existing data, #743).

An example would be Vickers hardness measurements scraped from the scientific literature. The Vickers hardness depends on the applied load of the Vickers tip, which varies from publication to publication; in this case, from experience, removing the applied load results in almost no predictive performance. However, if the applied load is provided to the algorithm as a search parameter, it will naturally suggest minimizing it, because Vickers hardness is (almost?) always higher for smaller loads. The same thing appears in many materials science & chemistry optimization problems, including several of the projects I'm working on.

Phrased another way, the model benefits from additional features (appended, not replacing), but only optimizes over the original features. The FixedParameter class comes to mind, but I don't think the intention was for a FixedParameter to vary for every datapoint. The discussion in https://github.com/facebook/Ax/issues/773 related to sequential search spaces also comes to mind, but after attempting to use this, I get:

Fixed parameter size with out of design value: 5.0 passed to `RemoveFixed`.
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\remove_fixed.py]()", line 51, in transform_observation_features
    raise ValueError(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py]()", line 217, in _transform_data
    obs_feats = t_instance.transform_observation_features(obs_feats)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py]()", line 170, in __init__
    obs_feats, obs_data, search_space = self._transform_data(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py]()", line 99, in __init__
    super().__init__(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\registry.py]()", line 342, in __call__
    model_bridge = bridge_class(
  File "[C:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\axforchemistry\utils\experiment.py]()", line 61, in generate_sobol_experiments
    model = BayesModel(
  File "[C:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\tutorials\multiple_constraints.py]()", line 183, in <module>
    y_preds, y_stds, next_experiments, trial_indices, model = generate_sobol_experiments(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 87, in _run_code
    exec(code, run_globals)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 87, in _run_code
    exec(code, run_globals)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 197, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

(call stack)

transform_observation_features (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\remove_fixed.py:51)
_transform_data (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py:217)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py:170)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py:99)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\registry.py:342)
generate_sobol_experiments (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\axforchemistry\utils\experiment.py:61)
<module> (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\tutorials\multiple_constraints.py:183)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:97)
run_path (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:268)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_as_main (Current frame) (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:197)

So I guess the idea of using FixedParameter doesn't pan out because of the use of RemoveFixed (https://ax.dev/docs/models.html#transforms). Maybe a way to get around this is to set narrow bounds in a new search space for a RangeParameter or set a ChoiceParameter with only one choice (not sure if the latter would throw an error).
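Concretely, the narrow-bounds idea could look something like the following as a Service API-style parameter config (a pure-dict sketch; the helper name and the tolerance are made up, and the resulting dict would be passed to `ax_client.create_experiment` in the usual way):

```python
# Sketch of the "narrow RangeParameter" workaround: instead of a FixedParameter
# (which the RemoveFixed transform strips before the model ever sees it), give
# each state variable a range hugging its observed value.

def narrow_range_parameter(name, value, tol=1e-3):
    """Build a Service API parameter config whose bounds hug `value`."""
    return {
        "name": name,
        "type": "range",
        "bounds": [value - tol, value + tol],
        "value_type": "float",
    }

# e.g. pin the applied load near the value it took in the literature data
applied_load = narrow_range_parameter("applied_load", 5.0)
```

Whether such nearly-degenerate bounds play nicely with the model fit is a separate question (see the Cholesky error further down in this thread).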

sgbaird commented 2 years ago

xref

sgbaird commented 2 years ago

It looks like a narrow RangeParameter runs without error (I set model_kwargs = {"fit_out_of_design": True, "fallback_to_sample_polytope": True} and model_kwargs = {"fit_out_of_design": True} for Sobol and Bayes, respectively, in a custom generation strategy), but then I run into another issue with a ChoiceParameter:

y contains previously unseen labels: 'poly1 & poly2'
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py]()", line 182, in _encode
    return _map_to_integer(values, uniques)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py]()", line 126, in _map_to_integer
    return np.array([table[v] for v in values])
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py]()", line 126, in <listcomp>
    return np.array([table[v] for v in values])
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py]()", line 120, in __missing__
    raise KeyError(key)

During handling of the above exception, another exception occurred:

  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py]()", line 184, in _encode
    raise ValueError(f"y contains previously unseen labels: {str(e)}")
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\preprocessing\_label.py]()", line 138, in transform
    return _encode(y, uniques=self.classes_)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\one_hot.py]()", line 43, in transform
    return self.label_binarizer.transform(self.int_encoder.transform(labels))
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\one_hot.py]()", line 119, in transform_observation_features
    vals = encoder.transform(labels=[obsf.parameters.pop(p_name)])[0]
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py]()", line 217, in _transform_data
    obs_feats = t_instance.transform_observation_features(obs_feats)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py]()", line 170, in __init__
    obs_feats, obs_data, search_space = self._transform_data(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py]()", line 99, in __init__
    super().__init__(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\registry.py]()", line 342, in __call__
    model_bridge = bridge_class(
  File "[C:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\axforchemistry\utils\experiment.py]()", line 61, in generate_sobol_experiments
    model = BayesModel(
  File "[C:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\tutorials\multiple_constraints.py]()", line 193, in <module>
    y_preds, y_stds, next_experiments, trial_indices, model = generate_sobol_experiments(
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 87, in _run_code
    exec(code, run_globals)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 268, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 87, in _run_code
    exec(code, run_globals)
  File "[C:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py]()", line 197, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,

(call stack)

_encode (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\utils\_encode.py:184)
transform (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\sklearn\preprocessing\_label.py:138)
transform (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\one_hot.py:43)
transform_observation_features (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\transforms\one_hot.py:119)
_transform_data (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py:217)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py:170)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py:99)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\registry.py:342)
generate_sobol_experiments (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\axforchemistry\utils\experiment.py:61)
<module> (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\tutorials\multiple_constraints.py:193)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:97)
run_path (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:268)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_as_main (Current frame) (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:197)

(not sure if you prefer call stacks or the stack traces).

sgbaird commented 2 years ago

If I try to work around the ChoiceParameter error by hard-coding the full label set into the encoder at p.values: https://github.com/facebook/Ax/blob/f7cdc53d8d6e7e83b1fc3d7d07accfc252a66f82/ax/modelbridge/transforms/one_hot.py#L103 e.g. via:

if p.name == "poly_type":
    # hack: fit the encoder on every label that appears in the data,
    # not just the values allowed in the (restricted) search space
    self.encoder[p.name] = OneHotEncoder(
        [
            "poly1",
            "poly2",
            "poly3",
            "poly4",
            "poly5",
            "poly6",
            "poly7",
            "poly8",
            "poly1 & poly2",
        ]
    )
else:
    self.encoder[p.name] = OneHotEncoder(p.values)

it runs without error, but this doesn't fix the issue: it still suggests candidates that are supposed to be inaccessible. It also produces a new error, related (I think) to the overly narrow bounds in the hacky RangeParameter workaround (I used +/- 1e-3):

torch.linalg_cholesky: The factorization could not be completed because the input is not positive-definite (the leading minor of order 46 is not positive-definite).

Trace Shapes:
Param Sites:
Sample Sites:
outputscale dist |
value |
mean dist |
value |
noise dist |
value |
kernel_tausq dist |
value |
_kernel_inv_length_sq dist 12 |
value 12 |
kernel_inv_length_sq dist | 12
value | 12
lengthscale dist | 12
value | 12

__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\torch\distributions\multivariate_normal.py:151)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\distributions\distribution.py:18)
single_task_pyro_model (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\models\torch\fully_bayesian.py:238)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\poutine\trace_messenger.py:180)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\poutine\trace_messenger.py:180)
get_trace (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\poutine\trace_messenger.py:198)
_guess_max_plate_nesting (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\util.py:251)
initialize_model (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\util.py:427)
_initialize_model_properties (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\hmc.py:259)
setup (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\hmc.py:325)
_gen_samples (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\api.py:144)
run (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\api.py:223)
run (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\infer\mcmc\api.py:563)
_context_wrap (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\pyro\poutine\messenger.py:12)
run_inference (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\models\torch\fully_bayesian.py:407)
_get_model_mcmc_samples (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\models\torch\fully_bayesian.py:302)
get_and_fit_model_mcmc (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\models\torch\fully_bayesian.py:346)
fit (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\models\torch\botorch.py:296)
_model_fit (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py:197)
_fit (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\array.py:103)
_fit (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py:131)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\base.py:181)
__init__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\torch.py:99)
__call__ (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\site-packages\ax\modelbridge\registry.py:342)
generate_sobol_experiments (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\axforchemistry\utils\experiment.py:61)
<module> (c:\Users\sterg\Documents\GitHub\sparks-baird\AxForChemistry\tutorials\multiple_constraints.py:194)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:97)
run_path (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:268)
_run_code (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:87)
_run_module_as_main (Current frame) (c:\Users\sterg\miniconda3\envs\axforchemistry\Lib\runpy.py:197)

but when I loosen the tolerance (e.g. to +/- 0.1, when the data bounds are [0.25, 10]), the error seems to go away.
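The "previously unseen labels" failure above can be reproduced in isolation: per the traceback, Ax's OneHot transform delegates to scikit-learn encoders, which only accept labels they were fit on. A minimal sketch (toy labels reused from the error message):

```python
# Reproduce the unseen-label error and the pre-fit-on-all-labels fix
# using scikit-learn directly (what Ax's OneHot transform calls into).
from sklearn.preprocessing import LabelEncoder

# fit only on the labels in the (restricted) search space
enc = LabelEncoder().fit(["poly1", "poly2"])
try:
    enc.transform(["poly1 & poly2"])  # label absent at fit time
    raised = False
except ValueError:
    # "y contains previously unseen labels: ..."
    raised = True

# fit on every label that can appear in the attached data instead
enc_full = LabelEncoder().fit(["poly1", "poly2", "poly1 & poly2"])
codes = enc_full.transform(["poly1 & poly2"])
```

This is essentially what the hard-coded `OneHotEncoder` hack above does, which is why the encoding error goes away even though the search-space restriction itself is unaffected.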

sgbaird commented 2 years ago

@lena-kashtelyan, I had come across https://github.com/facebook/Ax/issues/746 a while back but didn't realize the similarity due to the difference in language (state variables / extended features == contextual). If I'm interpreting that issue correctly, I think the desired outcome is identical. If so, then I guess this serves as a sciency translation of #746 with some poor/non-functional "workarounds" 😅. Will keep https://github.com/facebook/Ax/issues/733#issue-1066610439 + https://github.com/facebook/Ax/issues/746#issue-1073276364 in mind as the best workaround for now.

lena-kashtelyan commented 2 years ago

Sorry for the delay on this @sgbaird, I will get back to you tomorrow!

Balandat commented 2 years ago

So this sounds very much like a contextual problem; let's see if we can make that a bit more concrete (and mathy).

The way I understand this is that there is some space of contexts C (in this case, some c \in C could e.g. be the applied load), and a parameter / search space X. There is some context-dependent black-box function f(x; c) -> y to be maximized, where x \in X and c \in C. The goal is to identify some policy \pi: c -> x that optimizes some merit functional h(\pi). E.g., if there is some probability distribution P over contexts, then a reasonable thing to do could be to maximize the expected outcome over the context distribution: h(\pi) = E[f(\pi(c); c) | c \sim P].

Of course, ideally we'd be able to identify \pi^* that yields x^*(c) = argmax_{x \in X} f(x; c) for all c \in C, but that may not be possible (or necessary).
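The merit functional can be made concrete with a tiny Monte Carlo toy (everything here is hypothetical: a made-up f, a deliberately imperfect policy, and a Gaussian context distribution standing in for P):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, c):
    # toy context-dependent objective: the best x shifts with the context c
    return -(x - c) ** 2

def policy(c):
    # a (deliberately imperfect) policy pi: c -> x
    return 0.8 * c

# h(pi) = E_{c ~ P}[ f(pi(c); c) ], estimated by sampling contexts from P
contexts = rng.normal(loc=1.0, scale=0.2, size=10_000)
h_pi = np.mean(f(policy(contexts), contexts))

# the ideal policy pi*(c) = argmax_x f(x; c) = c achieves h = 0 here
h_star = np.mean(f(contexts, contexts))
```

In a real problem, f would be the (unknown) black box and the expectation would be taken under a surrogate model rather than by直 sampling f itself.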

@sgbaird, before going further, is this the setting we're in here?

sgbaird commented 2 years ago

@Balandat thank you for clarifying. Just to make sure: I'm reading the context-dependent function as f(x; c) -> y (if I have this flipped, please let me know). The terminology "policy" is vaguely familiar; a quick search pulls up:

A policy is a state-action mapping. A 'state' is a formalism used in AI that represents the state of the world, i.e. what the agent's idea of the world is. The action is, naturally, what action it should take in that state. A policy just maps states to actions. (source)

Let me know if I'm missing the mark here.

After reading your description, I realized I am probably mixing two notions, so I'll try to disentangle them. I believe the first follows the mathematical formalism you set up and involves knowing the context prior to calling .gen() for the next experiment, while in the second the extra information only becomes available after .gen(), once the experiment has begun.

Context is known before next experiment suggestion

In the first case, let's say a company has several types of polymers used for 3D printing a part: polymer_A, polymer_B, ..., polymer_G. They decide to send you a polymer_A part (i.e., outside of your control) and ask you to come up with the best set of parameters to post-process the part (e.g. cure time, polishing procedure: all tunable parameters within your control), with objective(s) that you measure as usual. However, you have existing data for many types of polymers, not just polymer_A. So you can either ignore the data you've collected for the other polymers (less desirable: it's expected that there's significant useful information to be learned from the optimal post-processing conditions for other polymers), or you can try to predict the best set of post-processing conditions specific to parts made from polymer_A. I think this case matches up with what you're describing, where polymer type is the scalar (sole) context (C) and the post-processing parameters are the search space (X). Importantly, the context is known before you retrieve a suggestion for the next best experiment to run (and before you ever set foot in the lab to conduct it).
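A minimal sketch of this "context fixed before generation" idea, with a toy function standing in for the fitted surrogate / acquisition value and all names hypothetical:

```python
import numpy as np

# Toy stand-in for a fitted surrogate over (cure_time, polymer_type); in
# practice this would be the model's posterior mean or an acquisition value.
POLYMER_OFFSET = {"polymer_A": 2.0, "polymer_B": 3.5}

def surrogate(cure_time, polymer):
    return -(cure_time - POLYMER_OFFSET[polymer]) ** 2

# The context arrives first ("a polymer_A part is on its way"), so we
# optimize only over the controllable parameter, holding the context fixed,
# i.e. we maximize over a slice of the joint (X, C) space.
def best_cure_time(polymer, grid=np.linspace(0.0, 5.0, 501)):
    vals = np.array([surrogate(t, polymer) for t in grid])
    return grid[np.argmax(vals)]
```

The grid search is just for illustration; the point is that the context enters the model as a feature but never as a free variable at generation time.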

"Context" is only known/measured after starting the next experiment

By contrast, the other situation is when the extra features will not be known until after the experiment has begun. This would be pretty typical of characterization routines or measurements of potentially useful but not necessarily important objectives. For example, after synthesizing the next prediction, you look at it under a microscope, perform some spectroscopy, etc. All additional data is generated after the experiment has already been suggested, and of course, the objective(s) of interest get measured, too (hardness, strength, etc.). You want the algorithm to be able to learn from the additional data extracted from microscopy images, spectra, etc. (e.g. average crystal size extracted via image processing), but you don't know what you're going to get until after starting down the expensive road of synthesis. I'd expect this extra information to directly affect the model fit/accuracy, but it's not totally clear to me how this would affect the generation step.

General comments

Both cases are of interest to me, and I can think of quite a few examples for each in materials science tasks. The first has an immediate application for a script I'm trying to get ready, and the latter will be increasingly important in the next few weeks/months. I'm OK with putting off the second one for now, but I'd be interested to know if the second one seems feasible within Ax.

sgbaird commented 2 years ago

@Balandat there are a few projects that would benefit from implementing contextual learning, and these are increasing in priority for me. For the two interpretations I mentioned before, the first one is of immediate value to me, and most closely matches the mathematical problem you set up (assuming I understood correctly): suggesting the next trial under the assumption that one of the input variables has now been fixed (i.e. maximization of the acquisition function across a slice out of the full volume).

Any suggestions for how to implement this?

adam-bhaiji commented 2 years ago

Hi @sgbaird,

I have been following the various discussions around Ax using "state variables" and wanted to check if my use case is supported by Ax.

Say I am designing a race car and I want to find the optimal parameters for wing design (e.g. length, width, angle etc.). So I provide these parameters to Ax, with the objective function being maximising the speed of the car over some simulated race. However, there are also some features I think would be useful to aid it during the modelling process. For example: the current wind speed and the type of tyres on the car. Some features are stochastic and others are non-stochastic.

I have come across terms such as "FixedParameter" and "ObservationFeature", but I am not sure if they are applicable. As I understand it, FixedParameter here would potentially be the type of tyres on the car. But for my stochastic feature (current wind speed), I don't think this is the use case, since this feature varies not just per trial but also per observation within a trial (especially in a multi-armed bandit setting, where the wing designs suggested by the trial arms may be used on cars with a mix of wind speeds and tyres on the day... in my toy example at least).

In short, I am trying to answer the question:

Given a type of tyre, what is the best wing design for my car, whilst considering contextual stochastic variables such as wind speed (which may vary depending on the type of tyre being used)?

Any help would be greatly appreciated, thanks.

sgbaird commented 2 years ago

So, I think the tire type fits into the first category I mentioned, whereas the wind speed fits into the second. AFAIK, the first category can be dealt with via the workaround I mentioned in https://github.com/facebook/Ax/issues/905#issuecomment-1096097249. The second probably requires something more custom, but maybe @Balandat or someone else can comment here. I haven't settled on an idea for how to deal with a context that is only known after starting the experiment. I wonder if this would involve making a list of predefined candidates, treating the stochastic wind speed as a feature in a feature vector (https://github.com/facebook/Ax/issues/771), and adding that information before filtering and suggesting your next candidate. Maybe there's a way to do this without the somewhat hacky/non-robust workaround of a predefined list; I'd also be worried about predefined candidates if your task is high-dimensional. I don't see anything blocking this from a theoretical standpoint: your model has a variable for wind speed, past wind speeds are fixed and known, and on the "day of the race" you can measure the wind speed and make decisions about the best car design based on that information.
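A toy sketch of the predefined-candidates idea (hypothetical names throughout; `predict` is a made-up stand-in for the fitted surrogate's posterior mean):

```python
# Enumerate feasible designs ahead of time; once the stochastic context
# (wind speed) is measured, append it to each candidate's features and
# rank the candidates with the model.
candidates = [
    {"length": 1.2, "width": 0.3, "angle": 10.0},
    {"length": 1.0, "width": 0.4, "angle": 15.0},
    {"length": 0.8, "width": 0.5, "angle": 20.0},
]

def predict(features):
    # toy model: steeper angles pay off at higher wind speed
    return features["angle"] * features["wind_speed"] / 100.0

def suggest(candidates, wind_speed):
    # plug the measured context into every candidate, then rank
    scored = [dict(c, wind_speed=wind_speed) for c in candidates]
    return max(scored, key=predict)

best = suggest(candidates, wind_speed=10.0)
```

As noted above, this gets brittle in high dimensions, where a fixed candidate list can't cover the design space.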

With your task, is it the case that you could effectively "design a new car on the day of the race" based on the wind conditions for that day? Or is it just that you could choose a particular set of tires to put on, but everything else about the car design is fixed? (asking since you mentioned this is a toy problem and to make sure I understand the non-toy problem nuances).

I wonder also if it would make sense to try use a risk-averse BO method. Maybe one of these two would be useful:

I think the 2nd one is new (just came across it) and it seems very relevant to the toy problem you described.

saitcakmak commented 2 years ago

Hi @adam-bhaiji. Your setup reminds me of what we refer to as "environmental variables" in robust optimization. These are the variables that the decision maker does not control (e.g., the wind speed) but they have an effect on the metrics. In robust optimization, we try to find designs that are robust to these variables, given their distribution. This setup makes most sense when the final value of the environmental variable isn't observed until after the decision is made, which is not exactly your setup. You seem to be making the final decision once you observe the wind speed. Anyway, if the robust optimization setup is of interest to you, I have an internal implementation of this that I'd be happy to clean up.

I don't know how much of this is exposed in Ax, but @qingfeng10 has done a bunch of work on contextual BO that is implemented in BoTorch. Flagging that since it might be useful here.

I haven't settled on an idea for how to deal with a context that is only known after starting the experiment.

Let's call x_d the decision variables and x_c the context variables that are observed after starting the experiment. Since you can't control x_c, you can't optimize it while generating candidates. But since you get it along with the metric observations, you can include it in the surrogate model. So, your model training data would be (x_d, x_c) and the corresponding metrics. This is similar to the setup used in robust BO with environmental variables, though there we typically assume x_c can be controlled in the simulator (which is not strictly necessary). The crucial point here is that you need to be using an acquisition function that knows of this setup, so a bit of customization may be required (the robust BO implementation I referred to above may be helpful).
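The training-data layout described above can be sketched with an off-the-shelf GP (scikit-learn here purely for brevity; in Ax/BoTorch this would be the surrogate model, and everything below is a toy under that assumption):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Training data is (x_d, x_c) plus the observed metric: the context column
# is included in the surrogate even though it is never optimized over.
x_d = rng.uniform(0, 1, size=(30, 1))        # decision variable
x_c = rng.uniform(0, 1, size=(30, 1))        # context, observed post hoc
y = np.sin(6 * x_d[:, 0]) + 0.5 * x_c[:, 0]  # toy metric

gp = GaussianProcessRegressor().fit(np.hstack([x_d, x_c]), y)

# At generation time, candidates vary only in x_d; x_c is pinned to a
# plausible value (here its sample mean) rather than being a free variable.
cand_d = np.linspace(0, 1, 101).reshape(-1, 1)
cand = np.hstack([cand_d, np.full_like(cand_d, x_c.mean())])
best_x_d = cand_d[np.argmax(gp.predict(cand)), 0]
```

A context-aware acquisition function would do something smarter than pinning x_c to a point estimate (e.g. integrating over its distribution), which is where the robust BO machinery comes in.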

I think the 2nd one is new

Pretty sure I implemented them together :). The implementation I have exposes that formulation in the Ax modular BoTorch model setup, building on RobustSearchSpace and all the relevant bits that were introduced in the past few months.

adam-bhaiji commented 2 years ago

Hi @sgbaird and @saitcakmak,

Thank you for the detailed responses, these are incredibly useful in understanding the ways Ax can be adapted! I have only just started familiarising myself with the library and implementing features by reading the docs/source code.

In response to your question @sgbaird, yes in this toy problem you can design a new car on the day of the race and say:

I want to race with soft tyres, the current measured wind speed is 10 mph, what parameters (length, width, angle) should I be using for my wing?

The other issues you linked to look well poised to use ObservationFeature and generate parameters based on the tyres being used; I will keep you posted on how that goes.

Of the two BoTorch links you provided, the environmental variables section looks especially promising. @saitcakmak I think your explanation fits this problem well, and it would be great if you could share your implementation of the robust optimisation setup. From my understanding, that is a robust optimisation setup with a custom Torch model; in your experience, is there a practical limit to the number of environmental variables the BO can handle? I guess at some point you could compress your N environmental variables into some kind of lower-dimensional space.

Additionally, I understand from the docs that Ax supports such custom models, but it is not overly clear to me where you would provide the x_c data (I'm thinking it is in the CustomMetric fetch_trial_data() method). Hopefully your implementation can shed some light on this too :)

lena-kashtelyan commented 2 years ago

Closing this issue as inactive; @adam-bhaiji and @sgbaird, please reopen it if you do follow up!

sgbaird commented 2 years ago

@adam-bhaiji any luck with ObservationFeature?

sgbaird commented 2 years ago

For reference, using attach_trial with a custom parameterization doesn't work when the attached data takes values other than a fixed categorical parameter's value:

parameters = [
    ...
    {'name': 'source', 'type': 'fixed', 'value': 'name2'}
    ...
]
...
_, idx = client.attach_trial(params) # params is a dictionary. One of the key-value pairs is "source": "name1"
File c:\Users\sterg\Miniconda3\envs\...\lib\site-packages\ax\service\ax_client.py:789, in AxClient.attach_trial(self, parameters, ttl_seconds)
    775 def attach_trial(
    776     self, parameters: TParameterization, ttl_seconds: Optional[int] = None
    777 ) -> Tuple[TParameterization, int]:
    778     """Attach a new trial with the given parameterization to the experiment.
    779 
    780     Args:
   (...)
    787         Tuple of parameterization and trial index from newly created trial.
    788     """
--> 789     self._validate_search_space_membership(parameters=parameters)
    791     # If search space is hierarchical, we need to store dummy values of parameters
    792     # that are not in the arm (but are in flattened search space), as metadata,
    793     # so later we are able to make the data for this arm "complete" in the
    794     # flattened search space.
    795     candidate_metadata = None

File c:\Users\sterg\Miniconda3\envs\...\lib\site-packages\ax\service\ax_client.py:1734, in AxClient._validate_search_space_membership(self, parameters)
   1733 def _validate_search_space_membership(self, parameters: TParameterization) -> None:
-> 1734     self.experiment.search_space.check_membership(
   1735         parameterization=parameters, raise_error=True
   1736     )
   1737     # `check_membership` uses int and float interchangeably, which we don't
   1738     # want here.
   1739     for p_name, parameter in self.experiment.search_space.parameters.items():

File c:\Users\sterg\Miniconda3\envs\...\lib\site-packages\ax\core\search_space.py:223, in SearchSpace.check_membership(self, parameterization, raise_error, check_all_parameters_present)
    221     if not self.parameters[name].validate(value):
    222         if raise_error:
--> 223             raise ValueError(
    224                 f"{value} is not a valid value for "
    225                 f"parameter {self.parameters[name]}"
    226             )
    227         return False
    229 # parameter constraints only accept numeric parameters

ValueError: "name1" is not a valid value for parameter FixedParameter(name='source', parameter_type=STRING, value="name2")

EDIT: I think I need to allow the original search space to accommodate all values, then generate the trial directly using the snippet from https://github.com/facebook/Ax/issues/746#issue-1073276364.