Thanks for the question, @dczaretsky! Yes, we can add known data points to an experiment before generating new Sobol trials. It would look something like:
import pandas as pd

from ax.core.arm import Arm
from ax.core.data import Data

start_params = {"a": 4.5e-07, "b": 2.3e-10}  # ...
start_data = Data(df=pd.DataFrame.from_records(
[
{
"arm_name": "0_0",
"metric_name": "metric0",
"mean": 3,
"sem": 0,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric1",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
# ...
]
))
exp = build_experiment() # from your code
trial = exp.new_trial()
trial.add_arm(Arm(parameters=start_params, name="0_0"))
exp.attach_data(start_data)
trial.run().complete()
# the above can be repeated for more datapoints, just don't reuse the arm name "0_0"
# now we can add sobol trials on top of this one
Hope that helps and let me know if I missed anything.
Thanks @danielcohenlive for the response. However, can you please provide some more explanation of what the fields in the Arm data records below are, and how they correlate with my 15 input data points? How do I assign an integer or floating-point value to each of my 15 input variables?
{
"arm_name": "0_0",
"metric_name": "metric0",
"mean": 3,
"sem": 0,
"trial_index": 0, # this matches the index the new trial will have
},
Also, for subsequent trials, do I just increment the trial_index, or do I need to adjust the arm_name as well?
Good questions. I think what you're referring to as inputs are start_params here, and the outputs are start_data. Basically, an experiment consists of trials with arms that have parameter values set (input), and then data, which has the resultant metric values for each arm (output). In the search_space you set up parameters, so you have something like:
from ax.core.parameter import ParameterType, RangeParameter
from ax.core.search_space import SearchSpace

SearchSpace(
    parameters=[
        RangeParameter(
            name="a",
            parameter_type=ParameterType.FLOAT,
            ...
        ),
        RangeParameter(
            name="b",
            parameter_type=ParameterType.FLOAT,
            ...
        ),
    ]
)
Then when you define an Arm, it takes a parameters argument with keys equal to the names of the search space parameters. In my example, I reduced the problem to 2 inputs/parameters and 2 outputs/metrics. In full, it would look like:
start_params = { "a":4.5e-07, "b":2.3e-10, "c":2e-10, "d":2e-10, "e":1e-12, "f":0.7, "g":15.0, "h":15.0, "i":3.0e-11, "j":1e-12, "k":7.4, "l":10, "m":10, "n":20, "o":46}
start_data = Data(df=pd.DataFrame.from_records(
[
{
"arm_name": "0_0",
"metric_name": "metric0",
"mean": 3,
"sem": 0,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric1",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric2",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric3",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric4",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric5",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric6",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric7",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric8",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
{
"arm_name": "0_0",
"metric_name": "metric9",
"mean": 2,
"sem": 0.15,
"trial_index": 0, # this matches the index the new trial will have
},
]
))
exp = build_experiment() # from your code
trial = exp.new_trial()
trial.add_arm(Arm(parameters=start_params, name="0_0"))
exp.attach_data(start_data)
trial.run().complete()
# the above can be repeated for more datapoints, just don't reuse the arm name "0_0"
# now we can add sobol trials on top of this one
Also, for subsequent trials, do I just increment the trial_index, or do I need to adjust the arm_name as well?
I think you'd need a new arm name for the new trial. Our convention for arm names is f"{trial_number}_{arm_number}", but you aren't required to use our convention.
If you wanted to add multiple trials, it could look something like:
trials = [
    {
        "input": {"a": 1.0, "b": 0.1..., "o": 7},
        "output": {"m0": {"mean": 0.1, "sem": 0}, ... "m9": {"mean": 0.1, "sem": 0}},
    },
    {
        "input": {"a": 1.0, "b": 0.1..., "o": 7.2},
        "output": {"m0": {"mean": 0.1, "sem": 0}, ... "m9": {"mean": 0.1, "sem": 0}},
    },
]
exp = build_experiment()  # from your code
# use a distinct name for the input/output spec so it isn't shadowed by the Ax trial object
for i, trial_spec in enumerate(trials):
    arm_name = f"{i}_0"
    trial = exp.new_trial()
    trial.add_arm(Arm(parameters=trial_spec["input"], name=arm_name))
    data = Data(df=pd.DataFrame.from_records([
        {
            "arm_name": arm_name,
            "metric_name": metric_name,
            "mean": output["mean"],
            "sem": output["sem"],
            "trial_index": i,
        }
        for metric_name, output in trial_spec["output"].items()
    ]))
    exp.attach_data(data)
    trial.run().complete()
# now we can add sobol trials on top
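For reference, adding Sobol trials on top in the developer API could then look roughly like the sketch below; this uses the standard Models registry pattern, but treat the exact arguments (number of trials, use of run()/complete()) as illustrative rather than the only way to do it:
from ax.modelbridge.registry import Models

# generate a few quasi-random Sobol trials on top of the manually attached ones
sobol = Models.SOBOL(search_space=exp.search_space)
for _ in range(5):
    trial = exp.new_trial(generator_run=sobol.gen(n=1))
    trial.run().complete()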
You could alternatively use the Service API, and it would look more like:
client = AxClient()
client.create_experiment(....)
# create new trial with input/params
client.attach_trial(parameters={ "a":4.5e-07, "b":2.3e-10, "c":2e-10, "d":2e-10, "e":1e-12, "f":0.7, "g":15.0, "h":15.0, "i":3.0e-11, "j":1e-12, "k":7.4, "l":10, "m":10, "n":20, "o":46})
# complete trial 0 with `raw_data` as output/data
# in the tuple `(1.7, 0)`, 1.7 is the mean value and 0 is the sem (uncertainty)
client.complete_trial(trial_index=0, raw_data={"m0": (1.7, 0), "m1": (0.6, 0)..., "m9": (0.9, 0)})
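From there, subsequent trials would be generated and completed as usual; a rough sketch (evaluate is a stand-in for your own evaluation function, and the loop bound is arbitrary):
for _ in range(5):
    parameters, trial_index = client.get_next_trial()
    # evaluate() here is a placeholder for however you compute m0...m9
    client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameters))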
Hi @danielcohenlive, thanks for the information. I'm confirming that the above worked. It does add that initial trial to the experiment.
However, the behavior is not what I expected. I assumed that if you provide a set of data points that already satisfies the output constraints, the MOO will choose it over random data points that do not satisfy any goals. But that's not what is happening; in fact, the opposite. It seems to ignore the initial trial almost entirely, and the batch iterations select other random points nowhere near the solution. My understanding is that it should select the best data set and then further optimize over the set of batch iterations (e.g. further minimize the constraints).
Am I missing something here? What I'm hoping to accomplish here is to provide a close-to-optimal solution and let the MOO carry it over the finish line, much quicker than searching the entire design space using pseudo-random permutations.
I appreciate any further insight you can share. Thanks!
Also wondering if this message has anything to do with it. When I remove the Sobol data points it seems to crash stating that the data set is empty. The first message below suggests that the arm data set was not included.
Batch iteration 0 [INFO 01-07 18:30:43] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
Traceback (most recent call last):
File "/multiobjective.py", line 549, in
There is an important warning here: Leaving out out-of-design observations for arms: 0_0. Ax models only use arms that are legal for the experiment's search space. Basically, the input (the params you set in the arm of the first trial) is not valid for the search space you are using. For example, you have "g": 15.0, but maybe in the search space "g" is supposed to be an integer between 0 and 10, and the model won't use that data point. That's what "out-of-design" means. I believe this is the problem.
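If it helps, you can also check this programmatically; a minimal sketch (assuming the exp and start_params objects from the earlier example) using the search space's own membership check:
# quick sanity check: is the starting point in-design for the experiment's search space?
in_design = exp.search_space.check_membership(start_params)
print(in_design)  # False here would explain the "out-of-design" warning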
If fixing that doesn't solve it, can you post the code where your search space is defined along with the code that does trial generation and iteration (with any sensitive info redacted)?
Hope that helps!
One thing we should consider doing is not filtering those observations out by default. There are situations in which it might be very reasonable to have observations outside the search space - maybe there is a bunch of data that has been collected, but the experimenter has the domain knowledge to restrict the search space to something smaller than the full support of the data. Including observations beyond the boundaries of the search space should help improve the model, even if these observations are deemed infeasible and would thus not be suggested by the generation strategy.
With the developer API we have the ability to do this - essentially have a large search space for the modeling, and then pass a new search space to gen. However, I am not sure if this is possible for the service API (or whether it should be...). Curious what @lena-kashtelyan's thoughts are on this.
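For illustration, a rough sketch of that developer-API pattern, assuming an experiment exp built with the wide search space and a hypothetical narrow_search_space used only for candidate generation (treat the details as indicative, not prescriptive):
from ax.modelbridge.registry import Models

# fit the MOO model on the experiment whose (wide) search space keeps all observations in-design
moo = Models.MOO(experiment=exp, data=exp.fetch_data())

# generate candidates restricted to the narrower search space
generator_run = moo.gen(n=1, search_space=narrow_search_space)
trial = exp.new_trial(generator_run=generator_run)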
I concur with @Balandat. We have a wide range of data for each input variable, but generally each project would have a narrower window, depending on the specifications. Therefore, it might be reasonable to start from a nearby solution.
Furthermore, I'm still struggling with this a little bit. As I mentioned above, I may have, say, 9/10 outputs that satisfy the objectives. Would the arm be filtered only on inputs out of range, or also on outputs out of range? I ask because even after modifying the ranges for inputs, the arm is still filtered out. If the arm is filtered by outputs, that would definitely be a roadblock. Although, as I mentioned earlier, it may be critical to allow some inputs outside the range as a starting point.
I don't believe that we do any filtering on the range of the output. We have some winsorization transforms that winsorize the data, but they wouldn't remove it (and I don't think they are even enabled by default).
One thing we should consider doing is not filtering those observations out by default. There are situations in which it might be very reasonable to have observations outside the search space - maybe there is a bunch of data that has been collected, but the experimenter has the domain knowledge to restrict the search space to something smaller than the full support of the data. Including observations beyond the boundaries of the search space should help improve the model, even if these observations are deemed infeasible and would thus not be suggested by the generation strategy.
This is a problem I ran into a while back (using the Service API) and my workaround was to loosen the constraints on the parameters. At the time, it seemed better than throwing out the data. A hundred or so experimental data points had been collected and the search space was later restricted.
Another, very related application that I'm now running into: you don't know much about many parameters initially (e.g. 20+ hyperparameters), so you do an initial search across all of them, prune down to the top 5 or 10 hyperparameters (or those with feature importances above some threshold) while keeping the less relevant ones fixed at their optimal values, and then continue the hyperparameter search in the restricted subspace. While the second search could be run without the data from the first search, as you mentioned it would likely benefit from it. Something neat about this is that it could be closed-loop.
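As a rough sketch of that restriction step (the hyperparameter names and values below are hypothetical; RangeParameter and FixedParameter are the standard Ax parameter classes), the pruned subspace could be expressed as:
from ax.core.parameter import FixedParameter, ParameterType, RangeParameter
from ax.core.search_space import SearchSpace

restricted_space = SearchSpace(
    parameters=[
        # keep searching over the most important hyperparameters, with tightened bounds
        RangeParameter(name="lr", parameter_type=ParameterType.FLOAT, lower=1e-4, upper=1e-2),
        RangeParameter(name="dropout", parameter_type=ParameterType.FLOAT, lower=0.0, upper=0.3),
        # pin a less relevant hyperparameter at its best value from the first search
        FixedParameter(name="batch_size", parameter_type=ParameterType.INT, value=64),
    ]
)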
@danielcohenlive thanks for the tip. I was able to identify the culprit. In my model, I have a parameter constraint included in the search space, but that value was left out of the initial start_params for the arm trial.
One side point: it took a bit of hacking of the back end to locate the source of the error. I'm wondering if there might be an easier way, e.g. by allowing one to pass in raise_error=True so that such problems can be more easily identified.
Once I modified the parameter constraint, the earlier message disappeared, but now I'm getting another error: "There are no feasible observed points." I can see that all 9 of my output constraints (or objectives) are satisfied here. This is coming from an internal function, infer_objective_thresholds. Any idea what might be causing this error?
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p1 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p2 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p3 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p4 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p5 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p6 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p7 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p8 is constant, within tolerance.
[INFO 01-08 16:25:09] ax.modelbridge.transforms.standardize_y: Outcome p9 is constant, within tolerance.
Traceback (most recent call last):
File "/home/dev/multi-objective.py", line 572, in
@dczaretsky is that the full stack trace? For the "There are no feasible observed points" error, I generally found that scrolling up in the error message revealed the actionable item (for example, there was an error during the evaluation of the objective function such as an indexing error).
@sgbaird that's the entire stack & log print out.
@dczaretsky are you using any Ax functions to validate that the observed points are feasible? Are you able to share the data so we can take a closer look?
One thing to note is that in this function to infer the objective thresholds the feasibility is computed on the points predicted by the model. So it could be that if the model does a good amount of smoothing / regularization that the observed measurements at the training points are feasible, while the model predicted ones are not - this usually shouldn't happen if the model fit is good, but it could explain this error here.
One thing we should consider doing is not filtering those observations out by default. There are situations in which it might be very reasonable to have observations outside the search space - maybe there is a bunch of data that has been collected, but the experimenter has the domain knowledge to restrict the search space to something smaller than the full support of the data. Including observations beyond the boundaries of the search space should help improve the model, even if these observations are deemed infeasible and would thus not be suggested by the generation strategy.
I think we should be able to do this through the Service API as well, by specifying a custom generation strategy with model_kwargs={"fit_out_of_design": True} (which would configure this setting) for the BayesOpt step; @Balandat, does that sound right to you?
If that is right, making a custom generation strategy would look like this (copied from the generation strategy tutorial and adjusted for multi-objective optimization (MOO)):
from ax.modelbridge.generation_strategy import GenerationStrategy, GenerationStep
from ax.modelbridge.registry import Models
gs = GenerationStrategy(
steps=[
# 1. Initialization step (does not require pre-existing data and is well-suited for
# initial sampling of the search space)
GenerationStep(
model=Models.SOBOL,
num_trials=5, # How many trials should be produced from this generation step
min_trials_observed=3, # How many trials need to be completed to move to next model
),
# 2. Bayesian optimization step (requires data obtained from previous phase and learns
# from all data available at the time of each new candidate generation call)
GenerationStep(
model=Models.MOO,
num_trials=-1, # No limitation on how many trials should be produced from this step
model_kwargs={"fit_out_of_design": True},
max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol
# More on parallelism vs. required samples in BayesOpt:
# https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
),
]
)
ax_client=AxClient(generation_strategy=gs)
@lena-kashtelyan that's great. Just to double check: the Service API does not allow one to (explicitly) use different search spaces in different steps, right? In that case, would the advice just be to create a new client with the new search space, report the previously gathered data (using the "fit_out_of_design": True specification), and continue from there?
@Balandat,
The Service API does not allow one to (explicitly) use different search spaces in different steps, right?
Right now it does not, no. We are actually considering adding ax_client.get_next_trial(**kwargs) (with kwargs getting passed to generation_strategy.gen), so that would let us pass the new search space; we can discuss the benefits/drawbacks of that approach internally. It mostly came to mind as a solution to https://github.com/facebook/Ax/issues/746. In the meantime, I believe this is exactly the way to go:
would the advice just be to create a new client with the new search space, report the previously gathered data (using the "fit_out_of_design": True specification), and continue from there?
@dczaretsky are you using any Ax functions to validate that the observed points are feasible? Are you able to share the data so we can take a closer look?
One thing to note is that in this function to infer the objective thresholds the feasibility is computed on the points predicted by the model. So it could be that if the model does a good amount of smoothing / regularization that the observed measurements at the training points are feasible, while the model predicted ones are not - this usually shouldn't happen if the model fit is good, but it could explain this error here.
@Balandat it seems the error was due to the fact that I did not include objective thresholds for the MOO variables. After adding some baseline thresholds, the error seems to go away. I assume the error means it was attempting to infer thresholds when they are not supplied directly?
@lena-kashtelyan any idea when support for the "fit_out_of_design" feature might be available to test? Also will it be compatible with the way I have implemented the initial data points as outlined by @danielcohenlive above, or will we need to migrate to the GenerationStrategy implementation?
I assume the error means it was attempting to infer thresholds if not supplied directly?
Correct, we use a heuristic to infer the thresholds if they are not provided. But this shouldn't error out; it suggests that there is an issue with that logic. Maybe @sdaulton, who wrote this, has some thoughts on what could be going wrong?
@lena-kashtelyan any idea when support for the "fit_out_of_design" feature might be available to test?
Oh it is already available, my example above should work on current Ax stable. Let me know if it doesn't, @dczaretsky!
Also will it be compatible with the way I have implemented the initial data points as outlined by @danielcohenlive above, or will we need to migrate to the GenerationStrategy implementation?
It should be compatible; if you use the Service API, you can specify the generation strategy as given in my example, and then use attach_trial and complete_trial for any initial data points.
Once you get these suggestions working (using a custom generation strategy with fit_out_of_design=True and adding initial points through the Service API), let's see if you are still running into the issue with the inference of the objective thresholds; I wonder if somehow that function was just getting empty data (although that is doubtful based on your repro).
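As a side note, if you want to sidestep threshold inference entirely, thresholds can be supplied explicitly when creating the experiment; a small sketch (the metric names and threshold values below are placeholders, not recommendations):
from ax.service.utils.instantiation import ObjectiveProperties

# explicit thresholds keep Ax from having to call infer_objective_thresholds
objectives = {
    "m0": ObjectiveProperties(minimize=False, threshold=1.0),
    "m1": ObjectiveProperties(minimize=True, threshold=0.5),
}
# ...then pass objectives=objectives to ax_client.create_experiment(...)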
Hi, I have a very similar use case to the one described here, and am also using the steps recommended here (i.e. how to add initial data points). Interestingly, I am getting the 'There are no feasible observed points.' error although I am manually specifying thresholds for my objectives. More precisely, I am seeing that the thresholds are apparently inferred (or at least an attempt is made) even though I specified them (Service API, using the objectives argument and specifying the thresholds for my objectives). Any idea why this might happen? I can also provide a code snippet if required; I just thought maybe there is an obvious explanation for this.
Thanks in advance!
A code snippet & stack trace would be helpful - if you specify the thresholds manually, we should not try to infer them automatically; it seems like there could be a bug somewhere.
+1 to @Balandat on this –– @mcd01, please open a new issue with the code snippet and stacktrace of the error you are getting, so we can look into this further. Thank you!
No problem, I did so, find it here: #802
Fantastic, will take a look shortly @mcd01!
Also @dczaretsky let us know how the suggestions above worked out for you and whether this issue is resolved with those!
Hi @lena-kashtelyan, I'm still having issues with the model per your suggestion. Unfortunately, the documentation is a bit thin, and there's not really much to go on in terms of understanding how the Service API works. Here is my implementation:
# Generation strategy to include pseudo random and known data points
gs = GenerationStrategy(
steps=[
# 1. Initialization step (does not require pre-existing data and is well-suited for
# initial sampling of the search space)
GenerationStep(
model=Models.SOBOL,
num_trials=5, # How many trials should be produced from this generation step
min_trials_observed=3, # How many trials need to be completed to move to next model
),
# 2. Bayesian optimization step (requires data obtained from previous phase and learns
# from all data available at the time of each new candidate generation call)
GenerationStep(
model=Models.MOO,
num_trials=-1, # No limitation on how many trials should be produced from this step
model_kwargs={"fit_out_of_design": True},
max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol
# More on parallelism vs. required samples in BayesOpt:
# https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
),
]
)
ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
name="ax_experiment",
parameters=[
{"name":"x0", "type":"range", "value_type":"float", "bounds":[5.0, 13.0]},
{"name":"x1", "type":"range", "value_type":"float", "bounds":[50.0e-12, 250.0e-12]},
{"name":"x2", "type":"range", "value_type":"float", "bounds":[300.0e-9, 600.0e-9]},
{"name":"x3", "type":"range", "value_type":"float", "bounds":[300.0e-12, 700.0e-12]},
{"name":"x4", "type":"range", "value_type":"float", "bounds":[150.0e-12, 600.0e-12]},
{"name":"x5", "type":"range", "value_type":"float", "bounds":[1.0e-12, 30.0e-12]},
{"name":"x6", "type":"range", "value_type":"float", "bounds":[1.0e-12, 60.0e-12]},
{"name":"x7", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x8", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x9", "type":"range", "value_type":"float", "bounds":[0.3, 2.0]},
{"name":"x10", "type":"range", "value_type":"float", "bounds":[1.0, 10.0]},
{"name":"x11", "type":"range", "value_type":"float", "bounds":[5.0, 150.0]},
{"name":"x12", "type":"range", "value_type":"float", "bounds":[1.0, 25.0]},
{"name":"x13", "type":"range", "value_type":"float", "bounds":[10.0e-12, 60.0e-12]},
{"name":"x14", "type":"range", "value_type":"float", "bounds":[15.0e-9, 100.0e-9]},
{"name":"x15", "type":"range", "value_type":"float", "bounds":[0, 100]},
],
objectives={
# `threshold` arguments are optional
"y0": ObjectiveProperties(minimize=False),
"y1": ObjectiveProperties(minimize=False),
"y2": ObjectiveProperties(minimize=True),
"y3": ObjectiveProperties(minimize=True),
},
overwrite_existing_experiment=True,
is_test=False,
)
# get starting points
print("\nInitializing trials...\n")
init_data_points = [
{"x2": 4.5141866178851045e-07,"x3": 2.3249939526831138e-10,"x1": 2e-10,"x4": 2e-10,"x5": 1e-12,"x9": 0.7374054903263668,"x10": 15.0,"x12": 15.0,"x6": 3.0684166087952084e-11,"x13": 1e-12,"x0": 7.4,"x7": 10e-12,"x8": 10e-12,"x11": 20,"x14": 46e-9, "x15": 20.1334},
]
for d in init_data_points:
parameters, trial_index = ax_client.attach_trial(d)
ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameters))
# ### Run Optimization
print("\nRunning trials...\n")
for i in range(25):
parameters, trial_index = ax_client.get_next_trial()
# Local evaluation here can be replaced with deployment to external system.
ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameters))
I'm seeing the following error:
parameters, trial_index = ax_client.attach_trial(d)
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 664, in attach_trial
self._validate_search_space_membership(parameters=parameters)
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 1481, in _validate_search_space_membership
self.experiment.search_space.check_membership(
File "/python/env/lib/python3.9/site-packages/ax/core/search_space.py", line 209, in check_membership
raise ValueError(
ValueError: 2.3249939526831138e-10 is not a valid value for parameter RangeParameter(name='x3', parameter_type=FLOAT, range=[3e-10, 7e-10])
From what I can tell, the initial trial value for x3 is outside the specified range for that parameter, which is causing the error. However, I understood that model_kwargs={"fit_out_of_design": True} would allow such values for the initial trials. Am I missing something here?
Hi @dczaretsky! I think there are two issues getting a bit conflated here: 1) model fitting to out-of-design points, and 2) actually being able to attach trials to the experiment (before ever using the model, as your error comes from the call to attach_trial). You are running into the second issue, because in the Service API we actually check that the trial is part of the search space when we attach it, so that validation is blocking you from adding an out-of-design trial (as the value of 2.3249939526831138e-10 is outside the interval [3e-10, 7e-10]).
In general, the Service API is a somewhat simplified interface that does not cater to the presence of out-of-design points (although we could change that; we will consider it). In the meantime, the workaround would be to manually add the trials you want if you really want to include out-of-design points. The way to do that is, in place of each ax_client.attach_trial call, to do the following:
trial = ax_client.experiment.new_trial()
trial.add_arm(Arm(parameters={"x1":..., "x2":..., ...}))
trial.mark_running(no_runner_required=True)
then, to add data for that trial:
ax_client.complete_trial(trial_index=trial.index, raw_data=evaluate(...))
@lena-kashtelyan thanks for the follow-up. I've made the changes, but I'm now getting a different error:
trial = ax_client.new_trial()
AttributeError: 'AxClient' object has no attribute 'new_trial'
Should it be ax_client.experiment.new_trial() ?
But I'm still getting errors here. It seems parameters is not part of add_arm(). Can you advise?
trial.add_arm(parameters=d)
File "/python/env/lib/python3.9/site-packages/ax/core/base_trial.py", line 165, in _immutable_once_run
return func(self, *args, **kwargs)
TypeError: add_arm() got an unexpected keyword argument 'parameters'
Yep, I had just edited the comment above to use ax_client.experiment.new_trial. And for add_arm, the correct line is (also edited above; Arm can be imported from ax.core.arm):
trial.add_arm(Arm(parameters={"x1":..., "x2":..., ...}))
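In code, that import is simply:
from ax.core.arm import Arm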
Thanks. The changes work, but as soon as the arm and initial trials complete, the program crashes with the error "no feasible observed points". Does this seem to imply that the trial data was completely filtered out?
[INFO 02-11 20:54:20] ax.service.ax_client: Completed trial 5 with data: {'x0': (0.203132, 0.0), 'x1': (-0.179923, 0.0), 'x2': (0.185545, 0.0), 'x3': (0.045817, 0.0), 'x4': (0.020111, 0.0), 'x5': (33.6229, 0.0), 'x6': (2.7211, 0.0), 'x7': (8.89977, 0.0), 'x8': (666667.0, 0.0), 'x9': (12.124767, 0.0)}.
Traceback (most recent call last):
File "/moo.py", line 538, in <module>
parameters, trial_index = ax_client.get_next_trial()
File "/python/env/lib/python3.9/site-packages/ax/utils/common/executils.py", line 147, in actual_wrapper
return func(*args, **kwargs)
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 355, in get_next_trial
generator_run=self._gen_new_generator_run(), ttl_seconds=ttl_seconds
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 1344, in _gen_new_generator_run
return not_none(self.generation_strategy).gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 330, in gen
return self._gen_multiple(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 471, in _gen_multiple
generator_run = _gen_from_generation_step(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 833, in _gen_from_generation_step
generator_run = generation_step.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_node.py", line 111, in gen
return model_spec.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/model_spec.py", line 170, in gen
return fitted_model.gen(**model_gen_kwargs)
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 231, in gen
return super().gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/base.py", line 674, in gen
observation_features, weights, best_obsf, gen_metadata = self._gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/array.py", line 276, in _gen
X, w, gen_metadata, candidate_metadata = self._model_gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 165, in _model_gen
X, w, gen_metadata, candidate_metadata = self.model.gen(
File "/python/env/lib/python3.9/site-packages/ax/models/torch/botorch_moo.py", line 296, in gen
full_objective_thresholds = infer_objective_thresholds(
File "/python/env/lib/python3.9/site-packages/ax/models/torch/botorch_moo_defaults.py", line 514, in infer_objective_thresholds
raise AxError("There are no feasible observed points.")
ax.exceptions.core.AxError: There are no feasible observed points.
Interesting; could I see what trials were generated and what data you completed them with?
@dczaretsky and @lena-kashtelyan this doesn't necessarily explain the unexpected behavior, but I did something similar with the Service API (thanks to much help over the last several months by Lena and Max) at one point. The following example should be runnable as-is on the latest branch. The basic strategy was to initialize two (nearly identical) ax_client objects with immutable_search_space_and_opt_config=False in create_experiment and different bounds on the parameters for each (and, of course, fit_out_of_design: True in the generation strategy).
"""Bayesian optimization using sequential search spaces."""
# %% imports
import numpy as np
import pandas as pd
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.models.torch.botorch_modular.surrogate import Surrogate
from ax.service.ax_client import AxClient
from botorch.acquisition.monte_carlo import qNoisyExpectedImprovement
from botorch.models.gp_regression import SingleTaskGP
batch_size = 1
unique_components = ["Al", "Co", "Cr", "Cu", "Fe", "Ni"]
compositions = np.array(
[
[18.2, 9.1, 18.2, 18.2, 18.2, 18.2],
[18.2, 18.2, 9.1, 18.2, 18.2, 18.2],
[18.2, 18.2, 18.2, 18.2, 9.1, 18.2],
[18.2, 18.2, 18.2, 18.2, 18.2, 9.1],
[5.3, 21.1, 21.1, 0, 26.3, 26.3],
[12.5, 12.5, 12.5, 0, 12.5, 50],
],
)
X_train = pd.DataFrame(compositions, columns=unique_components)
# normalize https://stackoverflow.com/a/35679163/13697228
X_train = X_train.div(X_train.sum(axis=1), axis=0)
X_train = X_train.iloc[:, :-1] # drop "Ni"
unique_components = unique_components[:-1]
np.random.seed(10)
y_train = 100 * np.random.rand(X_train.shape[0])
exp_name = "dummy_experiment"
target_name = "dummy"
n_components = X_train.shape[1]
n_train = X_train.shape[0]
orig_max_val = 1.0
max_val = 0.196
orig_parameters = [
{"name": component, "type": "range", "bounds": [0.0, orig_max_val]}
for component in unique_components[:-1]
]
parameters = [
{"name": component, "type": "range", "bounds": [0.0, max_val]}
for component in unique_components[:-1]
]
separator = " + "
orig_comp_constraint = separator.join(unique_components[:-1]) + f" <= {orig_max_val}"
composition_constraint = separator.join(unique_components[:-1]) + f" <= {max_val}"
# %% optimize
gs = GenerationStrategy(
steps=[
GenerationStep(
model=Models.BOTORCH_MODULAR,
num_trials=-1, # No limitation on how many trials should be produced from this step
max_parallelism=batch_size, # Parallelism limit for this step, often lower than for Sobol
# More on parallelism vs. required samples in BayesOpt:
# https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
model_kwargs={
# https://github.com/facebook/Ax/issues/768#issuecomment-1009007526
"fit_out_of_design": True,
"surrogate": Surrogate(SingleTaskGP),
"botorch_acqf_class": qNoisyExpectedImprovement,
},
),
]
)
ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
name=exp_name,
parameters=orig_parameters,
parameter_constraints=[orig_comp_constraint],
objective_name=target_name,
minimize=True,
immutable_search_space_and_opt_config=False,
)
ax_client_tmp = AxClient(generation_strategy=gs)
ax_client_tmp.create_experiment(
name=exp_name,
parameters=parameters,
parameter_constraints=[composition_constraint],
objective_name=target_name,
minimize=True,
immutable_search_space_and_opt_config=False,
)
ct = 0
for i in range(n_train):
ax_client.attach_trial(X_train.iloc[i, :-1].to_dict())
ax_client.complete_trial(trial_index=ct, raw_data=y_train[i])
ct = ct + 1
# narrow the search space
ax_client.experiment.search_space = ax_client_tmp.experiment.search_space
for _ in range(15):
parameters, trial_index = ax_client.get_next_trial()
ax_client.complete_trial(trial_index=trial_index, raw_data=np.random.rand())
best_parameters, metrics = ax_client.get_best_parameters()
@lena-kashtelyan Here is my current implementation. There are some additional new objectives and outcome constraints added to the model.
# Generation strategy to include pseudo random and known data points
gs = GenerationStrategy(
steps=[
# 1. Initialization step (does not require pre-existing data and is well-suited for
# initial sampling of the search space)
GenerationStep(
model=Models.SOBOL,
num_trials=5, # How many trials should be produced from this generation step
min_trials_observed=3, # How many trials need to be completed to move to next model
),
# 2. Bayesian optimization step (requires data obtained from previous phase and learns
# from all data available at the time of each new candidate generation call)
GenerationStep(
model=Models.MOO,
num_trials=-1, # No limitation on how many trials should be produced from this step
model_kwargs={"fit_out_of_design": True},
max_parallelism=3, # Parallelism limit for this step, often lower than for Sobol
# More on parallelism vs. required samples in BayesOpt:
# https://ax.dev/docs/bayesopt.html#tradeoff-between-parallelism-and-total-number-of-trials
),
]
)
ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
name="simulations",
parameters=[
{"name":"x0", "type":"range", "value_type":"float", "bounds":[5.0, 13.0]},
{"name":"x1", "type":"range", "value_type":"float", "bounds":[50.0e-12, 250.0e-12]},
{"name":"x2", "type":"range", "value_type":"float", "bounds":[300.0e-9, 600.0e-9]},
{"name":"x3", "type":"range", "value_type":"float", "bounds":[300.0e-12, 700.0e-12]},
{"name":"x4", "type":"range", "value_type":"float", "bounds":[150.0e-12, 600.0e-12]},
{"name":"x5", "type":"range", "value_type":"float", "bounds":[1.0e-12, 30.0e-12]},
{"name":"x6", "type":"range", "value_type":"float", "bounds":[1.0e-12, 60.0e-12]},
{"name":"x7", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x8", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x9", "type":"range", "value_type":"float", "bounds":[0.3, 2.0]},
{"name":"x10", "type":"range", "value_type":"float", "bounds":[1.0, 10.0]},
{"name":"x11", "type":"range", "value_type":"float", "bounds":[5.0, 150.0]},
{"name":"x12", "type":"range", "value_type":"float", "bounds":[1.0, 25.0]},
{"name":"x13", "type":"range", "value_type":"float", "bounds":[10.0e-12, 60.0e-12]},
{"name":"x14", "type":"range", "value_type":"float", "bounds":[15.0e-9, 100.0e-9]},
],
objectives={
# `threshold` arguments are optional
"y1": ObjectiveProperties(minimize=False),
"y4": ObjectiveProperties(minimize=False),
"y5": ObjectiveProperties(minimize=True),
"y9": ObjectiveProperties(minimize=True),
},
outcome_constraints=[
"y0 >= 0.8",
"y2 <= 180e-3",
"y3 >= 1.9", "y3 <= 2.1",
"y6 >= 6.0", "y6 <= 13.0",
"y7 >= 2.0", "y7 <= 5.0",
"y8 <= 300e6",
],
overwrite_existing_experiment=True,
is_test=False,
)
### Init with starting points
print("\nInitializing trials...\n")
init_data_points = [
{"x0": 7.4,"x1": 2e-10,"x2": 4.5141866178851045e-07,"x3": 2.3249939526831138e-10,"x4": 2e-10,"x5": 1e-12,"x6": 3.0684166087952084e-11,"x7": 10e-12,"x8": 10e-12,"x9": 0.7374054903263668,"x10": 15.0,"x11": 20,"x12": 15.0,"x13": 1e-12,"x14": 46e-9, "x15": 20.1334},
]
for d in init_data_points:
trial = ax_client.experiment.new_trial()
trial.add_arm(Arm(parameters=d))
trial.mark_running(no_runner_required=True)
ax_client.complete_trial(trial_index=trial.index, raw_data=evaluate(d))
# ### Run Optimization
print("\nRunning trials...\n")
for i in range(25):
parameters, trial_index = ax_client.get_next_trial()
# Local evaluation here can be replaced with deployment to external system.
ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(parameters))
ax_client.generation_strategy.trials_as_df
best_parameters, values = ax_client.get_best_parameters()
print(best_parameters)
Here is the output from the trials. The first iteration is the initial data point, followed by the trials. I can see that some of these trials are not producing data points in range, which would plausibly explain why they might be filtered (or ignored). But again, the idea was that, at the very least, the initial data points should be counted as a starting point for the optimization.
[INFO 02-12 14:47:05] ax.service.ax_client: Starting optimization with verbose logging. To disable logging, set the `verbose_logging` argument to `False`. Note that float values in the logs are rounded to 6 decimal points.
[INFO 02-12 14:47:05] ax.service.utils.instantiation: Due to non-specification, we will use the heuristic for selecting objective thresholds.
[INFO 02-12 14:47:05] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='x0', parameter_type=FLOAT, range=[5.0, 13.0]), RangeParameter(name='x1', parameter_type=FLOAT, range=[5e-11, 2.5e-10]), RangeParameter(name='x2', parameter_type=FLOAT, range=[3e-07, 6e-07]), RangeParameter(name='x3', parameter_type=FLOAT, range=[3e-10, 7e-10]), RangeParameter(name='x4', parameter_type=FLOAT, range=[1.5e-10, 6e-10]), RangeParameter(name='x5', parameter_type=FLOAT, range=[1e-12, 3e-11]), RangeParameter(name='x6', parameter_type=FLOAT, range=[1e-12, 6e-11]), RangeParameter(name='x7', parameter_type=FLOAT, range=[1e-12, 2e-11]), RangeParameter(name='x8', parameter_type=FLOAT, range=[1e-12, 2e-11]), RangeParameter(name='x9', parameter_type=FLOAT, range=[0.3, 2.0]), RangeParameter(name='x10', parameter_type=FLOAT, range=[1.0, 10.0]), RangeParameter(name='x11', parameter_type=FLOAT, range=[5.0, 150.0]), RangeParameter(name='x12', parameter_type=FLOAT, range=[1.0, 25.0]), RangeParameter(name='x13', parameter_type=FLOAT, range=[1e-11, 6e-11]), RangeParameter(name='x14', parameter_type=FLOAT, range=[1.5e-08, 1e-07])], parameter_constraints=[]).
Initializing trials...
[INFO 02-12 14:47:05] ax.service.ax_client: Completed trial 0 with data: {'y0': (0.822588, 0.0), 'y1': (-0.444993, 0.0), 'y2': (0.149345, 0.0), 'y3': (1.90744, 0.0), 'y4': (0.565881, 0.0), 'y5': (20.1334, 0.0), 'y6': (6.45355, 0.0), 'y7': (1.34418, 0.0), 'y8': (666667.0, 0.0), 'y9': (7.4, 0.0)}.
Running trials...
[INFO 02-12 14:47:05] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
[INFO 02-12 14:47:05] ax.service.ax_client: Generated new trial 1 with parameters {'x0': 12.732989, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.984791, 'x10': 3.690676, 'x11': 35.497298, 'x12': 17.765703, 'x13': 0.0, 'x14': 0.0}.
[INFO 02-12 14:47:14] ax.service.ax_client: Completed trial 1 with data: {'y0': (0.838974, 0.0), 'y1': (-0.562291, 0.0), 'y2': (0.131702, 0.0), 'y3': (1.57757, 0.0), 'y4': (0.218401, 0.0), 'y5': (48.9528, 0.0), 'y6': (9.82202, 0.0), 'y7': (20.1924, 0.0), 'y8': (666667.0, 0.0), 'y9': (12.732989, 0.0)}.
[INFO 02-12 14:47:14] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
[INFO 02-12 14:47:14] ax.service.ax_client: Generated new trial 2 with parameters {'x0': 11.949355, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.385573, 'x10': 1.098591, 'x11': 80.580385, 'x12': 6.902206, 'x13': 0.0, 'x14': 0.0}.
[INFO 02-12 14:47:23] ax.service.ax_client: Completed trial 2 with data: {'y0': (0.582957, 0.0), 'y1': (-0.076583, 0.0), 'y2': (0.190691, 0.0), 'y3': (0.0584, 0.0), 'y4': (0.059063, 0.0), 'y5': (40.6737, 0.0), 'y6': (3.00603, 0.0), 'y7': (10.7314, 0.0), 'y8': (666667.0, 0.0), 'y9': (11.949355, 0.0)}.
[INFO 02-12 14:47:23] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
[INFO 02-12 14:47:23] ax.service.ax_client: Generated new trial 3 with parameters {'x0': 11.098331, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.78637, 'x10': 8.07087, 'x11': 69.313436, 'x12': 17.878978, 'x13': 0.0, 'x14': 0.0}.
[INFO 02-12 14:47:31] ax.service.ax_client: Completed trial 3 with data: {'y0': (0.6419, 0.0), 'y1': (-0.111441, 0.0), 'y2': (0.138599, 0.0), 'y3': (0.096002, 0.0), 'y4': (0.073666, 0.0), 'y5': (44.5724, 0.0), 'y6': (3.5555, 0.0), 'y7': (5.82663, 0.0), 'y8': (666667.0, 0.0), 'y9': (11.098331, 0.0)}.
[INFO 02-12 14:47:31] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
[INFO 02-12 14:47:31] ax.service.ax_client: Generated new trial 4 with parameters {'x0': 12.23234, 'x1': 0.0, 'x2': 1e-06, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.725926, 'x10': 2.874652, 'x11': 135.188564, 'x12': 6.328041, 'x13': 0.0, 'x14': 0.0}.
[INFO 02-12 14:47:40] ax.service.ax_client: Completed trial 4 with data: {'y0': (0.31438, 0.0), 'y1': (-0.216531, 0.0), 'y2': (0.195287, 0.0), 'y3': (0.057991, 0.0), 'y4': (0.021123, 0.0), 'y5': (33.6307, 0.0), 'y6': (3.92987, 0.0), 'y7': (11.4973, 0.0), 'y8': (666667.0, 0.0), 'y9': (12.23234, 0.0)}.
[INFO 02-12 14:47:40] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
[INFO 02-12 14:47:40] ax.service.ax_client: Generated new trial 5 with parameters {'x0': 12.722654, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.243144, 'x10': 1.180144, 'x11': 67.033094, 'x12': 3.020138, 'x13': 0.0, 'x14': 0.0}.
[INFO 02-12 14:47:48] ax.service.ax_client: Completed trial 5 with data: {'y0': (0.218149, 0.0), 'y1': (-0.295022, 0.0), 'y2': (0.232367, 0.0), 'y3': (0.110812, 0.0), 'y4': (0.02879, 0.0), 'y5': (34.1826, 0.0), 'y6': (3.73821, 0.0), 'y7': (7.16634, 0.0), 'y8': (666667.0, 0.0), 'y9': (12.722654, 0.0)}.
[INFO 02-12 14:47:48] ax.modelbridge.transforms.standardize_y: Outcome y8 is constant, within tolerance.
Traceback (most recent call last):
File "/moo.py", line 538, in <module>
parameters, trial_index = ax_client.get_next_trial()
File "/python/env/lib/python3.9/site-packages/ax/utils/common/executils.py", line 147, in actual_wrapper
return func(*args, **kwargs)
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 355, in get_next_trial
generator_run=self._gen_new_generator_run(), ttl_seconds=ttl_seconds
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 1344, in _gen_new_generator_run
return not_none(self.generation_strategy).gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 330, in gen
return self._gen_multiple(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 471, in _gen_multiple
generator_run = _gen_from_generation_step(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 833, in _gen_from_generation_step
generator_run = generation_step.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_node.py", line 111, in gen
return model_spec.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/model_spec.py", line 170, in gen
return fitted_model.gen(**model_gen_kwargs)
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 231, in gen
return super().gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/base.py", line 674, in gen
observation_features, weights, best_obsf, gen_metadata = self._gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/array.py", line 276, in _gen
X, w, gen_metadata, candidate_metadata = self._model_gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 165, in _model_gen
X, w, gen_metadata, candidate_metadata = self.model.gen(
File "/python/env/lib/python3.9/site-packages/ax/models/torch/botorch_moo.py", line 296, in gen
full_objective_thresholds = infer_objective_thresholds(
File "/python/env/lib/python3.9/site-packages/ax/models/torch/botorch_moo_defaults.py", line 514, in infer_objective_thresholds
raise AxError("There are no feasible observed points.")
ax.exceptions.core.AxError: There are no feasible observed points.
Also, one other point: if I disable the Sobol trials (the first GenerationStep), then it seems to recognize the initial data points. In the log output below, you can see most of the data points are correctly in range. Is the Sobol step overriding the initial data set? Considering the output from Sobol, this would obviously be the best starting point. However, the program crashes when trying to get the next trial.
[INFO 02-12 15:50:22] ax.service.ax_client: Starting optimization with verbose logging. To disable logging, set the `verbose_logging` argument to `False`. Note that float values in the logs are rounded to 6 decimal points.
[INFO 02-12 15:50:22] ax.service.utils.instantiation: Due to non-specification, we will use the heuristic for selecting objective thresholds.
[INFO 02-12 15:50:22] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='x0', parameter_type=FLOAT, range=[5.0, 13.0]), RangeParameter(name='x1', parameter_type=FLOAT, range=[5e-11, 2.5e-10]), RangeParameter(name='x2', parameter_type=FLOAT, range=[3e-07, 6e-07]), RangeParameter(name='x3', parameter_type=FLOAT, range=[3e-10, 7e-10]), RangeParameter(name='x4', parameter_type=FLOAT, range=[1.5e-10, 6e-10]), RangeParameter(name='x5', parameter_type=FLOAT, range=[1e-12, 3e-11]), RangeParameter(name='x6', parameter_type=FLOAT, range=[1e-12, 6e-11]), RangeParameter(name='x7', parameter_type=FLOAT, range=[1e-12, 2e-11]), RangeParameter(name='x8', parameter_type=FLOAT, range=[1e-12, 2e-11]), RangeParameter(name='x9', parameter_type=FLOAT, range=[0.3, 2.0]), RangeParameter(name='x10', parameter_type=FLOAT, range=[1.0, 10.0]), RangeParameter(name='x11', parameter_type=FLOAT, range=[5.0, 150.0]), RangeParameter(name='x12', parameter_type=FLOAT, range=[1.0, 25.0]), RangeParameter(name='x13', parameter_type=FLOAT, range=[1e-11, 6e-11]), RangeParameter(name='x14', parameter_type=FLOAT, range=[1.5e-08, 1e-07])], parameter_constraints=[]).
Initializing trials...
[INFO 02-12 15:50:22] ax.service.ax_client: Completed trial 0 with data: {'y0': (0.822588, None), 'y1': (-0.444993, None), 'y2': (0.149345, None), 'y3': (1.90744, None), 'y4': (0.565881, None), 'y5': (20.1334, None), 'y6': (6.45355, None), 'y7': (1.34418, None), 'y8': (666667.0, None), 'y9': (7.4, None)}.
Running trials...
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y0 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y1 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y2 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y3 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y4 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y5 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y6 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y7 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y8 is constant, within tolerance.
[INFO 02-12 15:50:22] ax.modelbridge.transforms.standardize_y: Outcome y9 is constant, within tolerance.
Traceback (most recent call last):
File "/moo.py", line 527, in <module>
parameters, trial_index = ax_client.get_next_trial()
File "/python/env/lib/python3.9/site-packages/ax/utils/common/executils.py", line 147, in actual_wrapper
return func(*args, **kwargs)
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 355, in get_next_trial
generator_run=self._gen_new_generator_run(), ttl_seconds=ttl_seconds
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 1344, in _gen_new_generator_run
return not_none(self.generation_strategy).gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 330, in gen
return self._gen_multiple(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 471, in _gen_multiple
generator_run = _gen_from_generation_step(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_strategy.py", line 833, in _gen_from_generation_step
generator_run = generation_step.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/generation_node.py", line 111, in gen
return model_spec.gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/model_spec.py", line 170, in gen
return fitted_model.gen(**model_gen_kwargs)
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 231, in gen
return super().gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/base.py", line 674, in gen
observation_features, weights, best_obsf, gen_metadata = self._gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/array.py", line 276, in _gen
X, w, gen_metadata, candidate_metadata = self._model_gen(
File "/python/env/lib/python3.9/site-packages/ax/modelbridge/multi_objective_torch.py", line 165, in _model_gen
X, w, gen_metadata, candidate_metadata = self.model.gen(
File "/python/env/lib/python3.9/site-packages/ax/models/torch/botorch_moo.py", line 298, in gen
X_observed=not_none(X_observed),
File "/python/env/lib/python3.9/site-packages/ax/utils/common/typeutils.py", line 34, in not_none
raise ValueError(message or "Argument to `not_none` was None.")
ValueError: Argument to `not_none` was None.
@lena-kashtelyan the errors above suggest that the problem originates where Ax is attempting to infer the objective thresholds. This is the same error I was encountering in the Developer API, as reported above.
I added some thresholds for the objectives, and it seems the error above went away. I think that should be looked into. Separately, it still seems that the initial data point provided via the Arm is not considered.
I'm also getting a different error now when calling ax_client.get_pareto_optimal_parameters(); it reports:
File "/moo.py", line 536, in <module>
best_parameters = ax_client.get_pareto_optimal_parameters()
File "/python/env/lib/python3.9/site-packages/ax/service/ax_client.py", line 782, in get_pareto_optimal_parameters
return best_point_utils.get_pareto_optimal_parameters(
File "/python/env/lib/python3.9/site-packages/ax/service/utils/best_point.py", line 351, in get_pareto_optimal_parameters
raise NotImplementedError(
NotImplementedError: Support for outcome constraints is currently under development.
Am I to understand that outcome constraints are not supported in the Service API when identifying best points for multi-objective optimizations?
Am I to understand that outcome constraints are not supported in the Service API when identifying best points for multi-objective optimizations?
That is correct, but it should be easy for us to add that support. Would you mind making a separate issue for this, so we can make it into a ticket internally?
Will look into the error you list above, thank you for the repro! It's very helpful.
@lena-kashtelyan one more point to add here. I did print the data points from the simulations, and I don't see the initial Arm data points. With each run, I do see the log:
[INFO 02-14 19:56:14] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
Which leads me to believe the initial data points are not being considered by the MOO (trials 5+). Any idea what's going on here?
|    |   Generation Step | Generation Model   |   Trial Index | Trial Status   | Arm Parameterizations                                                                                                                                                                                         |
|---:|------------------:|:-------------------|--------------:|:---------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0 | 0 | Sobol | 1 | COMPLETED | {'1_0': {'x0': 7.55, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.78, 'x10': 2.58, 'x11': 14.47, 'x12': 16.84, 'x13': 0.0, 'x14': 0.0}} |
| 1 | 0 | Sobol | 2 | COMPLETED | {'2_0': {'x0': 7.07, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.81, 'x10': 8.73, 'x11': 41.0, 'x12': 13.19, 'x13': 0.0, 'x14': 0.0}} |
| 2 | 0 | Sobol | 3 | COMPLETED | {'3_0': {'x0': 11.43, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.58, 'x10': 4.11, 'x11': 146.3, 'x12': 10.21, 'x13': 0.0, 'x14': 0.0}} |
| 3 | 0 | Sobol | 4 | COMPLETED | {'4_0': {'x0': 10.93, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.63, 'x10': 8.74, 'x11': 14.25, 'x12': 6.71, 'x13': 0.0, 'x14': 0.0}} |
| 4 | 0 | Sobol | 5 | COMPLETED | {'5_0': {'x0': 6.84, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.62, 'x10': 8.58, 'x11': 51.4, 'x12': 13.5, 'x13': 0.0, 'x14': 0.0}} |
| 5 | 1 | MOO | 6 | COMPLETED | {'6_0': {'x0': 8.92, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.53, 'x10': 9.31, 'x11': 57.65, 'x12': 1.24, 'x13': 0.0, 'x14': 0.0}} |
| 6 | 1 | MOO | 7 | COMPLETED | {'7_0': {'x0': 7.1, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.94, 'x10': 6.39, 'x11': 142.28, 'x12': 21.84, 'x13': 0.0, 'x14': 0.0}} |
| 7 | 1 | MOO | 8 | COMPLETED | {'8_0': {'x0': 9.21, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.03, 'x10': 5.26, 'x11': 55.59, 'x12': 5.31, 'x13': 0.0, 'x14': 0.0}} |
| 8 | 1 | MOO | 9 | COMPLETED | {'9_0': {'x0': 5.17, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 1.14, 'x10': 3.55, 'x11': 109.17, 'x12': 17.26, 'x13': 0.0, 'x14': 0.0}} |
| 9 | 1 | MOO | 10 | COMPLETED | {'10_0': {'x0': 8.91, 'x1': 0.0, 'x2': 0.0, 'x3': 0.0, 'x4': 0.0, 'x5': 0.0, 'x6': 0.0, 'x7': 0.0, 'x8': 0.0, 'x9': 0.37, 'x10': 4.86, 'x11': 116.48, 'x12': 2.17, 'x13': 0.0, 'x14': 0.0}} |
@lena-kashtelyan wondering if you had any updates or thoughts on why my initial data set is getting filtered out? I understood from the conversation above that migrating to the Service API and using model_kwargs={"fit_out_of_design": True} would allow these data points to be included in the initial trials.
Based on the log below, it seems that is not happening.
[INFO 02-14 19:56:14] ax.modelbridge.base: Leaving out out-of-design observations for arms: 0_0
Hi again @dczaretsky, sorry for the delay! We'll get back to you on this shortly:
Which leads me to believe the initial data points are not being considered by the MOO (trials 5+). Any idea what's going on here?
Edit: @dczaretsky, figured it out! The out-of-design points are being left out of the data passed to the Sobol step (where it doesn't matter, as that step does not fit a model), but not from the data passed to the MOO model. You can see that after the "Completed trial 5 with data" log (where 5 is the last Sobol trial), the log about out-of-design points being left out no longer appears.
If you don't want the log to appear at all, you can also pass the `"fit_out_of_design": True` setting to the Sobol step, but it won't affect your optimization there in any way.
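For reference, a minimal sketch of what that generation-strategy setup might look like with the Service API (the step counts and model choices here are illustrative, mirroring the 5 Sobol + MOO setup in the table above):
from ax.modelbridge.generation_strategy import GenerationStrategy, GenerationStep
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient

gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,
            # Sobol does not fit a model, so this only silences the
            # "leaving out out-of-design observations" log.
            model_kwargs={"fit_out_of_design": True},
        ),
        GenerationStep(
            model=Models.MOO,
            num_trials=-1,
            # Keeps the manually attached out-of-design point in the
            # MOO model's training data.
            model_kwargs={"fit_out_of_design": True},
        ),
    ]
)

ax_client = AxClient(generation_strategy=gs)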
To validate that the MOO model does receive the custom point you pass in by manually creating a trial, you can check the model's training data. Steps for this:
1. Call `get_next_trial` 5 times and complete those trials to get the Sobol trials out of the way,
2. Call `get_next_trial` once more so Ax fits the MOO model,
3. Check `ax_client.generation_strategy.model._training_data`, which should include 6 observations: 1 custom that you attached and 5 from Sobol.

> I did print the data points from the simulations, and I don't see the initial Arm data points
Not sure I understand this part exactly. What is this a printout of? All the trials on the experiment, or something else?
Edit: I see, this is a printout of `generation_strategy.trials_as_df`. That will only include the trials produced by the Ax generation strategy; to see all trials on the experiment (including the manually attached ones), you can do this: `exp_to_df(ax_client.experiment)`, where `exp_to_df` comes from `ax.service.utils.report_utils`.
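For example, a minimal sketch of both checks, assuming an `ax_client` set up as in the snippets above:
from ax.service.utils.report_utils import exp_to_df

# All trials on the experiment, including the manually attached ones
# (unlike generation_strategy.trials_as_df, which only covers generated trials).
print(exp_to_df(ax_client.experiment))

# After the 5 Sobol trials are complete and get_next_trial has been called once
# more (so the MOO model is fit), the model's training data should contain the
# manually attached point plus the 5 Sobol points.
print(ax_client.generation_strategy.model._training_data)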
@lena-kashtelyan I did create the extra trial with custom inputs, but those data points are (a) not being reported in the data as you described above, and (b) are not influencing the subsequent trials.
To print out the trials I'm using the following:
df = trials_to_df(ax_client.generation_strategy)
df.to_csv('trials.csv')
As demonstrated above, the custom input simulation is not showing up in the data. The MOO trials are random and nowhere close to the custom input, which is reasonably close to optimal (8/10 goals satisfied). Even after 100 iterations I don't see the MOO simulations converging (e.g., none of the outcome constraints or min/max goals fall in range). I suspect these two issues are related.
Can you tell me:
1. Why are the custom data points I attached not showing up in the trial data?
2. Why are my output constraints & goals not converging given a close-to-optimal solution (8/10 goals satisfied)?
For reference again, these are the input data points that should be visible in the data:
### Init with starting points
print("\nInitializing trials...\n")
init_data_points = [
{"x0": 7.4,"x1": 2e-10,"x2": 4.5141866178851045e-07,"x3": 2.3249939526831138e-10,"x4": 2e-10,"x5": 1e-12,"x6": 3.0684166087952084e-11,"x7": 10e-12,"x8": 10e-12,"x9": 0.7374054903263668,"x10": 15.0,"x11": 20,"x12": 15.0,"x13": 1e-12,"x14": 46e-9, "x15": 20.1334},
]
for d in init_data_points:
    trial = ax_client.experiment.new_trial()
    trial.add_arm(Arm(parameters=d))
    trial.mark_running(no_runner_required=True)
    ax_client.complete_trial(trial_index=trial.index, raw_data=evaluate(d))
Edit: I've printed out the data for 10 trials using the exp_to_df that you recommended, so I can now see that the initial data points are included. What I'm still not sure about is why those data points are ignored in trials 5+, since they are closest to optimal.
| | y0 | y4 | y8 | y2 | y1 | y3 | y9 | y5 | y10 | y6 | y7 | trial_index | arm_name | x2 | x3 | x1 | x4 | x5 | x9 | x10 | x12 | x6 | x13 | x0 | x7 | x8 | x11 | x14 | trial_status | generation_method |
|---:|----------:|-----------:|------------:|---------:|-----------:|----------:|-----------:|---------:|--------------:|---------:|----------:|--------------:|-----------:|------------:|------------:|------------:|------------:|------------:|---------:|---------:|---------:|------------:|------------:|---------:|------------:|------------:|--------:|------------:|:---------------|:--------------------|
| 0 | 0.822588 | 0.565881 | 666667 | 0.149345 | -0.444993 | 1.90744 | 7.4 | 20.1334 | 2.72073 | 6.45355 | 1.34418 | 0 | 0_0 | 4.51419e-07 | 2.32499e-10 | 2e-10 | 2e-10 | 1e-12 | 0.737405 | 15 | 15 | 3.06842e-11 | 1e-12 | 7.4 | 1e-11 | 1e-11 | 20 | 4.6e-08 | COMPLETED | Manual |
| 2 | 0.213454 | 0.0525083 | 666667 | 0.158377 | -2.31781 | 1.4292 | 11.7083 | 32.3614 | 2.76397 | 6.77042 | 24.6513 | 1 | 1_0 | 3.61888e-07 | 6.72768e-10 | 1.45647e-10 | 5.76189e-10 | 1.5649e-11 | 0.508502 | 4.80827 | 10.8869 | 1.62016e-11 | 2.14321e-11 | 11.7083 | 1.95475e-11 | 1.11967e-11 | 20 | 6.11335e-08 | COMPLETED | Sobol |
| 3 | 0.72714 | 0.168154 | 666667 | 0.136629 | -0.0544575 | 0.0706453 | 6.26783 | 19.9317 | 3.18 | 1.49534 | 7.14776 | 2 | 2_0 | 3.63646e-07 | 4.58984e-10 | 1.19564e-10 | 3.34185e-10 | 1.50048e-11 | 1.84398 | 6.28797 | 17.2874 | 1.5774e-11 | 5.65841e-11 | 6.26783 | 9.32094e-12 | 1.55787e-11 | 20 | 7.09468e-08 | COMPLETED | Sobol |
| 4 | 0.0655752 | 0.00625421 | 666667 | 0.12155 | -0.372475 | 0.023093 | 9.64342 | 33.4343 | 3.46706 | 0.85586 | 12.4441 | 3 | 3_0 | 3.63468e-07 | 4.36798e-10 | 9.93535e-11 | 4.186e-10 | 2.69423e-11 | 0.311181 | 2.40342 | 22.0978 | 1.03144e-11 | 4.56109e-11 | 9.64342 | 6.79461e-12 | 1.2446e-11 | 20 | 1.90987e-08 | COMPLETED | Sobol |
| 5 | 0.791419 | 0.266506 | 666667 | 0.151397 | -0.191026 | 0.340329 | 6.32026 | 21.5078 | 3.403 | 3.26689 | 12.721 | 4 | 4_0 | 3.26223e-07 | 5.72032e-10 | 1.85708e-10 | 2.66299e-10 | 2.76885e-11 | 1.25883 | 3.11285 | 12.9438 | 5.24845e-12 | 5.72013e-11 | 6.32026 | 5.62015e-12 | 1.0812e-11 | 20 | 2.34611e-08 | COMPLETED | Sobol |
| 6 | 0.13553 | 0.0340744 | 666667 | 0.233025 | -0.262597 | 0.0619982 | 6.59829 | 18.9044 | 2.86505 | 1.4064 | 4.65765 | 5 | 5_0 | 4.20121e-07 | 4.74655e-10 | 2.14653e-10 | 4.48932e-10 | 2.62429e-11 | 1.55667 | 5.94128 | 2.96277 | 5.78247e-11 | 2.74045e-11 | 6.59829 | 8.92761e-12 | 1.62633e-11 | 20 | 7.57762e-08 | COMPLETED | Sobol |
| 7 | 0.561962 | 0.203835 | 666667 | 0.182619 | -0.941274 | 1.81142 | 9.34769 | 25.4263 | 2.72006 | 7.33884 | 36.0394 | 6 | 6_0 | 3.09692e-07 | 3.34676e-10 | 9.9877e-11 | 2.41785e-10 | 2.47912e-11 | 0.550169 | 3.07784 | 8.24656 | 4.29246e-11 | 4.52031e-11 | 9.34769 | 1.78856e-11 | 5.3636e-12 | 20 | 9.56418e-08 | COMPLETED | MOO |
| 8 | 0.731199 | 0.226332 | 666667 | 0.133488 | -0.24127 | 0.422416 | 7.3893 | 19.617 | 2.65479 | 3.6317 | 13.0482 | 7 | 7_0 | 4.9627e-07 | 5.85661e-10 | 2.09471e-10 | 5.99538e-10 | 7.49516e-12 | 0.724754 | 7.76902 | 18.5234 | 3.34448e-11 | 4.84145e-11 | 7.3893 | 1.72159e-11 | 1.44052e-12 | 20 | 6.93068e-08 | COMPLETED | MOO |
| 9 | 0.356039 | 0.0893214 | 666667 | 0.140351 | -0.111707 | 0.0806876 | 7.3119 | 20.4551 | 2.79751 | 1.5951 | 5.93959 | 8 | 8_0 | 3.53732e-07 | 5.50677e-10 | 1.87304e-10 | 3.39938e-10 | 2.30672e-11 | 1.8142 | 9.97221 | 16.9461 | 5.75897e-11 | 4.37496e-11 | 7.3119 | 1.98883e-11 | 1.15994e-12 | 20 | 3.32956e-08 | COMPLETED | MOO |
| 10 | 0.723865 | 0.307733 | 666667 | 0.179745 | -0.400548 | 0.781786 | 6.14666 | 15.2371 | 2.47892 | 4.71862 | 18.5711 | 9 | 9_0 | 3.7292e-07 | 4.23949e-10 | 2.24579e-10 | 5.94305e-10 | 9.79907e-12 | 0.8953 | 1.62178 | 7.4267 | 1.88736e-11 | 1.59383e-11 | 6.14666 | 9.63098e-12 | 1.49591e-11 | 20 | 2.76372e-08 | COMPLETED | MOO |
| 1 | 0.0756807 | 0.0182845 | 666667 | 0.146131 | -0.27138 | 0.0296903 | 5.6642 | 16.5012 | 2.91324 | 0.974128 | 4.21129 | 10 | 10_0 | 5.79823e-07 | 3.75162e-10 | 2.4933e-10 | 5.9498e-10 | 7.55372e-12 | 1.43322 | 5.99969 | 15.1097 | 2.00591e-11 | 3.51874e-11 | 5.6642 | 1.53202e-11 | 7.34336e-12 | 20 | 9.88044e-08 | COMPLETED | MOO |
Hi @dczaretsky, sorry it took a bit to get back to you on this; I thought I had already responded.
To print out the trials I'm using the following:
df = trials_to_df(ax_client.generation_strategy) df.to_csv('trials.csv')
The issue there was that you were only printing out the trials associated with the generation strategy (which the trials you manually attached are not, since they were not produced by the Ax generation strategy). However, they are still fed to our models as part of the training data and show up in `exp_to_df`, as you show in your edit above. So that covers your question #1 above.

> Why are my output constraints & goals not converging given a close-to-optimal solution (8/10 goals satisfied)?

Let me make sure I understand your question correctly. You are saying that:
1) You are supplying initial known trials that are out-of-design (i.e. not in the search space),
2) You expect the model to use them to suggest points near those out-of-design optimal/good points,
3) Your parameters are these:
parameters=[
{"name":"x0", "type":"range", "value_type":"float", "bounds":[5.0, 13.0]},
{"name":"x1", "type":"range", "value_type":"float", "bounds":[50.0e-12, 250.0e-12]},
{"name":"x2", "type":"range", "value_type":"float", "bounds":[300.0e-9, 600.0e-9]},
{"name":"x3", "type":"range", "value_type":"float", "bounds":[300.0e-12, 700.0e-12]},
{"name":"x4", "type":"range", "value_type":"float", "bounds":[150.0e-12, 600.0e-12]},
{"name":"x5", "type":"range", "value_type":"float", "bounds":[1.0e-12, 30.0e-12]},
{"name":"x6", "type":"range", "value_type":"float", "bounds":[1.0e-12, 60.0e-12]},
{"name":"x7", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x8", "type":"range", "value_type":"float", "bounds":[1.0e-12, 20.0e-12]},
{"name":"x9", "type":"range", "value_type":"float", "bounds":[0.3, 2.0]},
{"name":"x10", "type":"range", "value_type":"float", "bounds":[1.0, 10.0]},
{"name":"x11", "type":"range", "value_type":"float", "bounds":[5.0, 150.0]},
{"name":"x12", "type":"range", "value_type":"float", "bounds":[1.0, 25.0]},
{"name":"x13", "type":"range", "value_type":"float", "bounds":[10.0e-12, 60.0e-12]},
{"name":"x14", "type":"range", "value_type":"float", "bounds":[15.0e-9, 100.0e-9]},
{"name":"x15", "type":"range", "value_type":"float", "bounds":[0, 100]},
],
4) Your initial point is this:
{
"x0": 7.4,
"x1": 2e-10,
"x2": 4.5141866178851045e-07,
"x3": 2.3249939526831138e-10,
"x4": 2e-10,
"x5": 1e-12,
"x6": 3.0684166087952084e-11,
"x7": 10e-12,
"x8": 10e-12,
"x9": 0.7374054903263668,
"x10": 15.0,
"x11": 20,
"x12": 15.0,
"x13": 1e-12,
"x14": 46e-9,
"x15": 20.1334
}
5) But the model doesn't end up exploring the region near that point you are supplying even after 100 trials?
Also what do you mean when you say "8/10 goals satisfied"? In your example above: https://github.com/facebook/Ax/issues/768#issuecomment-1037252217, I'm seeing 4 objectives and 9 constraints.
Closing this out as inactive. Feel free to reopen whenever, but please do so if you have follow-ups, as we might not see the comment on a closed issue.
Thanks. Unfortunately I had to abandon the FB Ax implementation because we couldn’t get a working model. Instead I ended up building a solution from scratch that successfully solved the multi-variable problem.
@dczaretsky is your implementation public? Curious also what you chose to use as a base for the custom solution.
@lena-kashtelyan,
I am adding a comment here as I am having an issue with fitting out-of-design samples (manually attaching them as suggested here: https://github.com/facebook/Ax/issues/768#issuecomment-1036515242).
Just some background: I have data that had less strict bounds, and I wanted to attach that data to an experiment that is a bit more restricted. The parameter names for the historical data are identical; the only thing I am changing here is the bounds.
# historical data
old_parameters = [{"name": "chemical_A", "type": "range", "bounds": [0, 100], "value_type": "float"}, ...]
# new, more restricted experiment
new_parameters = [{"name": "chemical_A", "type": "range", "bounds": [50, 65], "value_type": "float"}, ...]
Below is an example of the exact code that I am running; aside from the data, I went ahead and made a synthetic dataset (which unfortunately does not reproduce the error). But this should give you an idea of what I am doing here, and it was put together based on what was advised in your comment here: https://github.com/facebook/Ax/issues/768#issuecomment-1036515242. I do see that @Balandat mentions creating a new client here https://github.com/facebook/Ax/issues/768#issuecomment-1009050206 and @sgbaird does so here https://github.com/facebook/Ax/issues/768#issuecomment-1036809509
Main Error to address:
ValueError: There are no feasible observed points. This likely means that one or more outcome constraints or objective thresholds is set too strictly.
Another question I had here concerns a warning that gets displayed (it can be seen all the way at the bottom of this message): Input data is not contained to the unit cube. Please consider min-max scaling the input data.
The question here is whether I should be scaling the manually attached data.
The code w/ synthetic data:
from ax.modelbridge.generation_strategy import GenerationStrategy, GenerationStep
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient
from ax.core.arm import Arm
import numpy as np
import pandas as pd
#create synthetic dataset for reproducibility
# Create a random dataset with 4 features and 1 target
np.random.seed(0)
n = 10 # number of data points
# n samples from a 4-D Dirichlet distribution
samples = np.random.dirichlet(np.ones(4), size=n)
# Multiply each sample by 100 to get values that sum to 100
samples *= 100
# Separate the samples into chem1, chem2, chem3, chem4
chem1, chem2, chem3, chem4 = samples.T
target = np.random.rand(n) * 100
historical_data = pd.DataFrame({'chem1': chem1, 'chem2': chem2, 'chem3': chem3, 'chem4': chem4})
target_df = pd.DataFrame({'target': target})
target_df
# extract min and max bounds for each feature
min_param_vals_lst = [historical_data[historical_data[col] != 0][col].min() for col in historical_data.columns]
min_param_vals_lst
max_param_vals_lst = [historical_data[col].max() for col in historical_data.columns]
max_param_vals_lst
# widen the bounds found in the historical dataset by 5% to allow room for optimization
final_min_param_vals_lst = [(val - 5) if val > 7 else val for val in min_param_vals_lst]
final_max_param_vals_lst = [(val + 5) if val < 90 else val for val in max_param_vals_lst]
#setup parameter values for optimization
parameters = [
    {"name": chem, "type": "range", "bounds": [min_val, max_val], "value_type": "float"}
    for chem, min_val, max_val in zip(historical_data.columns, final_min_param_vals_lst, final_max_param_vals_lst)
]
constraint1 = " + ".join(historical_data.columns) + " >= 99.9"
constraint2 = " + ".join(historical_data.columns) + " <= 100.01"
model_kwargs_values = {"num_samples": 512, "warmup_steps": 1000,"fit_out_of_design":True}
#https://github.com/facebook/Ax/issues/768
gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.FULLYBAYESIAN,
            model_kwargs=model_kwargs_values,
            # Increasing this may result in better model fits
            num_trials=-1,
            # batch_size=3
            max_parallelism=3,
        ),
    ]
)
# set this up for original trials out_of_design
ax_client = AxClient(generation_strategy=gs)
ax_client.create_experiment(
    name="Adhesives",  # can be any name
    parameters=parameters,  # list of parameters
    objective_name="target",  # name of the objective
    parameter_constraints=[constraint1, constraint2],
    minimize=False,
)
for i in range(len(historical_data)):
    trial = ax_client.experiment.new_trial()
    trial.add_arm(Arm(parameters=historical_data.iloc[i, :].to_dict()))
    trial.mark_running(no_runner_required=True)
    ax_client.complete_trial(trial_index=i, raw_data=target_df.iloc[i].to_dict())
next_experiment, trial_index = ax_client.get_next_trials(max_trials=1)
print("Next Experiment: ", next_experiment)
best_parameters, metrics = ax_client.get_best_parameters()
The Error Message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[155], line 84
75 ax_client.complete_trial(trial_index=i, raw_data= optim_shear.iloc[i].to_dict())
77 # num_of_samples = len(optim_df)
78 # #attaches previous trials
79 # for i in range(num_of_samples):
80 # #attach previous trials
81 # ax_client.attach_trial(optim_df.iloc[i,:].to_dict())
82 # ax_client.complete_trial(trial_index=i, raw_data= optim_shear.iloc[i].to_dict())
---> 84 next_experiment, trial_index = ax_client.get_next_trials(max_trials=1)
85 print("Next Experiment: ", next_experiment)
87 best_parameters, metrics = ax_client.get_best_parameters()
File /opt/miniconda3/envs/AIO/lib/python3.11/site-packages/ax/service/ax_client.py:626, in AxClient.get_next_trials(self, max_trials, ttl_seconds)
624 for _ in range(max_trials):
625 try:
--> 626 params, trial_index = self.get_next_trial(ttl_seconds=ttl_seconds)
627 trials_dict[trial_index] = params
628 except OptimizationComplete as err:
File /opt/miniconda3/envs/AIO/lib/python3.11/site-packages/ax/utils/common/executils.py:161, in retry_on_exception.<locals>.func_wrapper.<locals>.actual_wrapper(*args, **kwargs)
157 wait_interval = min(
158 MAX_WAIT_SECONDS, initial_wait_seconds * 2 ** (i - 1)
159 )
...
--> 277 raise ValueError(NO_FEASIBLE_POINTS_MESSAGE)
278 # construct Objective module
279 if kwargs.get("chebyshev_scalarization", False):
ValueError: There are no feasible observed points. This likely means that one or more outcome constraints or objective thresholds is set too strictly.
The original data does appear to be attached, and I get this right before the error is thrown:
Input data is not contained to the unit cube. Please consider min-max scaling the input data.
Sample: 100%|██████████| 1512/1512 [02:10, 11.55it/s, step size=3.67e-01, acc. prob=0.892]
Thank you in advance!
Your issue is likely:
constraint1 = " + ".join(historical_data.columns) + " >= 99.9"
constraint2 = " + ".join(historical_data.columns) + " <= 100.01"
It looks like you're trying to implement an equality constraint via two inequality constraints. This is more of a hack and is unlikely to give you the performance you want. In the search space, this makes the "volume" appear as a thin slice, which doesn't work well with volume-based sampling and integration techniques. I also tried this in https://github.com/facebook/Ax/issues/727#issuecomment-974513487 with poor performance. It might throw an error, but it also might just produce bad results. Reparameterizing the search space as in https://github.com/facebook/Ax/issues/727#issuecomment-975644304 is the recommended approach.
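For illustration, here is a minimal sketch of that reparameterization idea for a four-component mixture like the one above (an assumed setup, not the exact code from the linked issue): search over the first three fractions and compute the fourth as the remainder inside the evaluation function, so the sum-to-100 equality never has to be encoded as a pair of inequalities.
from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="composition_reparameterized",
    parameters=[
        {"name": "chem1", "type": "range", "bounds": [0.0, 100.0], "value_type": "float"},
        {"name": "chem2", "type": "range", "bounds": [0.0, 100.0], "value_type": "float"},
        {"name": "chem3", "type": "range", "bounds": [0.0, 100.0], "value_type": "float"},
    ],
    # Keep the three free components from exceeding the total.
    parameter_constraints=["chem1 + chem2 + chem3 <= 100.0"],
    objective_name="target",
    minimize=False,
)

def evaluate(parameters):
    # chem4 is implied rather than searched over.
    chem4 = 100.0 - sum(parameters[name] for name in ("chem1", "chem2", "chem3"))
    # ... run the real measurement with chem1..chem4 here; placeholder value below.
    return {"target": (chem4, 0.0)}

params, trial_index = ax_client.get_next_trial()
ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(params))
With this setup the 99.9/100.01 slack constraints become unnecessary, since the composition sums to 100 by construction.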
@sgbaird,
I adjusted the constraint using the above recommended approach, but I am still getting
ValueError: There are no feasible observed points. This likely means that one or more outcome constraints or objective thresholds is set too strictly.
I did some digging and read through "Passing parameter_constraint-s with bounds other than [0,1] ....." where you and @Balandat discuss `search_space`s that are not [0,1]^d and how they do not play well with input transforms.
One question I have here is that when manually attaching data via:
for i in range(len(historical_data)):
    trial = ax_client.experiment.new_trial()
    trial.add_arm(Arm(parameters=historical_data.iloc[i, :].to_dict()))
    trial.mark_running(no_runner_required=True)
    ax_client.complete_trial(trial_index=i, raw_data=target_df.iloc[i].to_dict())
should I scale my dataset to [0,1]^d, and could that be the problem? (I was under the impression Ax does this internally via `UnitX` and `StandardizeY`, but I guess that might only be the case when the `search_space` is being passed to the `modelbridge` to generate new data points... not sure if I have that right; some clarification here would be nice.)
But from a few discussions I have now read, it appears that issues can sometimes arise from search spaces that are not defined in [0,1]^d.
I went ahead and defined the constraint to be:
constraint1 = " + ".join(param_list[1:]) + " <= 1.01"
I also noticed that when I manually attach data I typically get:
Input data is not contained to the unit cube. Please consider min-max scaling the input data.
cc'ing @dme65 @lena-kashtelyan for any thoughts.
EDIT: @lena-kashtelyan, just reviving this in case it was missed, and I wanted to add a question. In general, should we normalize our dataset (maybe with something like MinMaxScaler) before setting up our Ax optimization, to avoid any issues with the bounds?
I'm experiencing what I think is the same issue as @ramseyissa. If I update the range of a parameter in the `search_space` after having performed a few observations, I get
ValueError: There are no feasible observed points. This likely means that one or more outcome constraints or objective thresholds is set too strictly.
if there are no past observations in the new range, even when using `fit_out_of_design=True`.
I believe this is because the observations are still being filtered out in filter_constraints_and_fixed_features when they are out of the new bounds. Is this the expected behavior?
Apologies also if I should have opened a new issue, but it seems relevant to the current discussion.
I'm following the tutorial for implementing the multi-objective optimization. With 15 input variables and 10 output variables, the search space is vast. However, we have some known data points that we know are relatively close to optimal (e.g. 9/10 output variables are satisfied).
I would like to initialize the experiment with some known data points before implementing the pseudo-random Sobol trials. I've been searching through the API and I can see there is some functionality to attach trials, but I can't seem to get this to work in the current tutorial. I'm looking to do something like what I have below.
Can you provide some guidance on how to attach any number of initial trials for multi-objective optimization? Ideally, I would like to be able to programmatically add these data points before any experiment begins. Thanks!