Hi @sgbaird! Let me think through this a bit and get back to you!
@Balandat and I will get back to you on this one next week, @sgbaird! It's an involved change, so we'll need to make a plan.
Hi again @sgbaird! We discussed this and here's what we're thinking: we'd go through `BoTorchModel` to hook that component into Ax. What do you think? cc @Balandat and @dme65 for more thoughts on this, also : )
This is great! I'm looking forward to working on this. As for the constraints, could you clarify if the idea is for me to work on my problem-specific constraints or produce a general interface for constraints? For my problem-specific constraint (limiting the maximum number of components), after giving it some thought, it seems that the only transforms that affect the constraint are `UnitX` and `StandardizeY`. I'm probably oversimplifying this in my head, and I'm sure I'll figure out the real difficulty further along.
For a more general interface or template, I'd need to give it some more thought and base it off of a few example constraints people might want to implement.
could you clarify if the idea is for me to work on my problem-specific constraints or produce a general interface for constraints?
Definitely just the problem-specific constraints! If you end up liking working on Ax so much that you want to build a general interface, we'll of course appreciate all the help we can get, but we certainly didn't mean to suggest that you'd be on the hook for that : )
it seems that the only transforms that affect the constraint are UnitX and StandardizeY
I think that might be right; let me loop in @Balandat for that one!
it seems that the only transforms that affect the constraint are UnitX and StandardizeY
So `UnitX` operates on the parameters, while `StandardizeY` operates on the outcomes. I don't recall you trying to use some kind of compound constraints that involve both parameters and outcomes, do you? If no outcomes are involved, `StandardizeY` shouldn't be a concern. `UnitX`, however, would be. I don't think in your case other transforms would be applied that would affect the constraints.
I also know that @dme65 has started to think about this, so maybe he has additional thoughts.
@Balandat good point. I was mistakenly thinking of `n_components` as an additional outcome based on other threads where I was exploring the idea, but in this case, it really is just the single objective (compressive strength) that's the outcome. That was my bad.
Hi @sgbaird,
We have added support for non-linear inequality constraints in BoTorch, which you need for your ||x||_0 <= 3 constraint: https://github.com/pytorch/botorch/pull/1067. The next step will be for us to figure out the best way to expose this functionality in Ax.
I'm attaching a notebook in the meantime that shows how to use this in BoTorch (you need to be on the main branch) for a toy problem with a similar constraint. The attached file is a notebook, but GitHub doesn't like .ipynb, so I uploaded it as a .txt: Scipy non-linear inequality constraints.txt
In terms of adding this to Ax, we have two paths:
1) Through `Models.BOTORCH_MODULAR`:
a) This setup does not yet support SAASBO –– that support is planned by end of March;
b) When it does support SAASBO, passing callable constraints should just be a matter of specifying an argument to `Models.BOTORCH_MODULAR`.
2) Through `FullyBayesianBoTorchModel` (SAASBO ~= FullyBayesian), which works on top of the legacy setup:
a) Extract `"nonlinear_inequality_constraints"` from kwargs in `scipy_optimizer` and pass it down to `optimize_acqf`;
b) To pass custom constraints there, we'll need to add `model_gen_kwargs={"optimizer_kwargs": {"nonlinear_inequality_constraints": constraint_callables}}` to the SAASBO generation step.
While path 1 doesn't require any changes to Ax, we need support for SAASBO first, so I think we can start with path 2? I can put up a PR for 2.a sometime this week. @dme65 does this sound reasonable?
@sgbaird, I think in the meantime you can start thinking about how you will formulate your constraints! The format should be as expected by `optimize_acqf` in pytorch/botorch#1067 : )
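For concreteness, a minimal sketch (mine, not from the thread) of a callable in that format, following the shape convention quoted from the docstring later in this thread; values >= 0 mean the constraint is satisfied:

import torch

def example_ineq_constraint(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical example: the constraint sum(x) <= 1, written as 1 - sum(x) >= 0.
    # Takes a `(num_restarts) x q x d`-dim tensor and returns a
    # `(num_restarts) x q`-dim tensor of constraint values.
    return 1.0 - x.sum(dim=-1)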
Wonderful! I will get started on the constraint formulation. Very excited!
@dme65 I was able to run your example notebook. Very nice! Really appreciated the explanations, so thank you for the Jupyter Notebook format. At first, I thought: why not use `torch.count_nonzero`? But it seems pretty clear now that there's no grad function for this (i.e. essentially what you described about `||x||_0 <= 3` not being differentiable, though some misleading PyTorch forum posts led me to think otherwise initially and find out myself). Clever approach with the narrow Gaussians. IIUC, as the fractional prevalence (`x`) for a particular component approaches zero, the Gaussian basis function evaluated at `x` tends towards 1 (`lim_{x-->0} f(x) == 1`). So if you have 3 components with fractional prevalences close to zero, then the sum of all the narrow Gaussians is approximately 3 (i.e. an approximation of `torch.count_nonzero` with gradients). Brilliant.
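To make that concrete, a quick numeric check (my own sketch, not from the notebook) that the sum of narrow Gaussians acts as a differentiable count of near-zero components:

import torch

def narrow_gaussian(x, ell=1e-3):
    return torch.exp(-0.5 * (x / ell) ** 2)

x = torch.tensor([0.0, 0.0, 0.0, 0.2, 0.3, 0.5])  # three components at zero
print(narrow_gaussian(x).sum())  # ~3.0: counts the near-zero components
print(x.numel() - narrow_gaussian(x).sum())  # ~3.0: a differentiable stand-in for torch.count_nonzero(x)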
Following up on my comment above: https://github.com/facebook/Ax/issues/769#issuecomment-1021499537, @dme65 discovered that supporting this in Ax will require a bit more work than I originally thought; we'll probably need a couple more weeks to get the Ax-side PR into place : (
@lena-kashtelyan ok, no worries! Thank you for the heads up
I took a quick stab at enabling this in Ax in https://github.com/facebook/Ax/pull/794. I think this works, but I haven't really tested it thoroughly, and the PR will need some work before we can actually merge it, which will have to wait until @lena-kashtelyan is back. Attaching a notebook with an Ax version of the example shared previously. This will require BoTorch main + https://github.com/facebook/Ax/pull/794 to work. The notebook has a few caveats, which are listed at the top. I hope this will unblock you for now, @sgbaird.
@dme65 this is great, thank you!!
Just to clarify, I should still be able to pass in a linear inequality constraint to `optimizer_kwargs`, correct? (i.e. the "sum to one" compositional constraint `sum(x[:-1]) <= 1` where `x[-1] = 1 - sum(x[:-1])`)
Also, I'm getting occasional assertion warnings (I changed these to warnings to probe a bit more) where it's suggesting 4 or 5 non-zero components, often with still fairly low fractional prevalences:
import warnings

for arm in trial.arms:
    arm._parameters = {k: 0.0 if v < 1e-3 else v for k, v in arm.parameters.items()}
    n_comp = sum([v > 1e-3 for v in arm.parameters.values()])
    if n_comp > 3:
        warnings.warn(f"n_comp == {n_comp} ! <= 3, v: {arm.parameters}")
Iteration: 0, Best in iteration -0.004, Best so far: -0.329
Iteration: 1, Best in iteration -0.178, Best so far: -0.329
C:\Users\sterg\AppData\Local\Temp\ipykernel_28564\3340465626.py:55: UserWarning: n_comp == 4 ! <= 3, v: {'x0': 0.0, 'x1': 0.11563452134735948, 'x2': 0.0, 'x3': 0.006604924689092644, 'x4': 0.19968109587658064, 'x5': 0.6645668999536406}
Iteration: 2, Best in iteration -0.772, Best so far: -0.772
Iteration: 3, Best in iteration -0.497, Best so far: -0.772
Iteration: 4, Best in iteration -0.000, Best so far: -0.772
Iteration: 5, Best in iteration -0.175, Best so far: -0.772
C:\Users\sterg\AppData\Local\Temp\ipykernel_28564\3340465626.py:55: UserWarning: n_comp == 5 ! <= 3, v: {'x0': 0.0, 'x1': 0.018885369278223107, 'x2': 0.11291579534863656, 'x3': 0.0030020011881849587, 'x4': 0.277176485183048, 'x5': 0.6884316391705234}
Iteration: 6, Best in iteration -0.965, Best so far: -0.965
Iteration: 7, Best in iteration -0.911, Best so far: -0.965
C:\Users\sterg\AppData\Local\Temp\ipykernel_28564\3340465626.py:55: UserWarning: n_comp == 4 ! <= 3, v: {'x0': 0.0, 'x1': 0.0, 'x2': 0.05389958485271028, 'x3': 1.0, 'x4': 1.0, 'x5': 0.6940677921472179}
Iteration: 8, Best in iteration -0.000, Best so far: -0.965
C:\Users\sterg\AppData\Local\Temp\ipykernel_28564\3340465626.py:55: UserWarning: n_comp == 5 ! <= 3, v: {'x0': 0.0, 'x1': 0.21992163953430968, 'x2': 0.27473429165090957, 'x3': 0.0318818731691185, 'x4': 0.3210271559652862, 'x5': 1.0}
I'm guessing this has to do with it being a soft constraint and the width of the Gaussian? Constraint violation seems to get worse as the optimization progresses (in some cases, with all 6 components being non-zero) and even with a narrower Gaussian (`ell=1e-4`).
@sgbaird My guess is that you aren't actually on BoTorch main which means that BoTorch ignores the non-linear inequality constraints (reinstalling Ax probably installed a stable version of BoTorch). Can you try to pull from the main branch from BoTorch and then reinstall BoTorch?
Yeah, it should be possible to pass these constraints down. I think the `model_gen_options` should look like this:

model_gen_options={
    "optimizer_kwargs": {
        "nonlinear_inequality_constraints": [ineq_constraint],
        "batch_initial_conditions": batch_initial_conditions,
        "equality_constraints": [(torch.arange(6), torch.ones(6), 1)],  # sum(x) == 1
    },
}
You'll also have to make sure that the initial Sobol points satisfy these additional constraints!
@sgbaird My guess is that you aren't actually on BoTorch main which means that BoTorch ignores the non-linear inequality constraints (reinstalling Ax probably installed a stable version of BoTorch). Can you try to pull from the main branch from BoTorch and then reinstall BoTorch?
🤦 Pulled from main, reinstalled BoTorch (`pip install -e .`), and now:
>>> import ax
>>> ax.__version__
'0.1.18.dev922'
>>> import botorch
>>> botorch.__version__
'0.6.1.dev31+gc564e333'
That seems to have fixed it. Also, for completeness, I forgot to mention that I also ended up needing to `pip install pyro-ppl` to get the code running the first time!
Yeah, it should be possible to pass these constraints down. I think the `model_gen_options` should look like this:

model_gen_options={
    "linear_constraints": [(torch.arange(5), torch.ones(5), 1)],  # sum(x[:-1]) <= 1
    "optimizer_kwargs": {
        "nonlinear_inequality_constraints": [ineq_constraint],
        "batch_initial_conditions": batch_initial_conditions,
        "equality_constraints": [(torch.arange(6), torch.ones(6), 1)],  # sum(x) == 1
    },
}
You'll also have to make sure that the initial Sobol points satisfy these additional constraints!
Awesome, I will give this a try.
Ah, my bad. I was passing down your `sum(x[:-1]) <= 1` constraint incorrectly. Btw, it actually isn't needed, since `sum(x) == 1` and `0 <= x_i <= 1` imply `sum(x[:-1]) <= 1`; but if you want to pass it in, the right way to do so is to add `parameter_constraints=[ParameterConstraint(constraint_dict={f"x{i}": 1 for i in range(5)}, bound=1)]` when creating the search space. I have updated my answer above to not pass in the constraint.
@dme65 thanks for clarifying, and no worries. I ended up using `SumConstraint`. The main reason for using the `sum(x[:-1]) <= 1` formulation is that Ax doesn't support equality constraints, so I've been typically using it in the code I've been developing, per https://github.com/facebook/Ax/issues/727#issuecomment-975644304. Still working on the adaptation.
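For reference, a sketch of that formulation (parameter names x0..x4 assumed, with x5 computed as 1 - sum(x[:-1]) in the evaluation function, as in the linked comment):

from ax.core.parameter import ParameterType, RangeParameter
from ax.core.parameter_constraint import SumConstraint
from ax.core.search_space import SearchSpace

parameters = [
    RangeParameter(name=f"x{i}", parameter_type=ParameterType.FLOAT, lower=0.0, upper=1.0)
    for i in range(5)  # x5 is derived as 1 - sum(x0..x4), not searched
]
search_space = SearchSpace(
    parameters=parameters,
    parameter_constraints=[
        SumConstraint(parameters=parameters, is_upper_bound=True, bound=1.0)  # sum(x[:-1]) <= 1
    ],
)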
I think I misunderstood what you meant earlier. Linear equality constraints are actually supported if you pass them down directly to BoTorch, as I did in my suggestion earlier: `"equality_constraints": [(torch.arange(6), torch.ones(6), 1)]` corresponds to your `sum(x) == 1` constraint. Now, this has the same caveat as the non-linear inequality constraint, namely that it won't play well with input transforms; but given that your search space is `[0, 1]^d`, there shouldn't be any issues. It will also be easy to sample a random point that satisfies the non-linear inequality constraint and this linear equality constraint:
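For instance (my sketch, not the snippet from the comment): zero out three random components of a uniform draw and renormalize the rest, which satisfies both ||x||_0 <= 3 and sum(x) == 1:

import torch

def sample_feasible_point(d=6, n_zero=3):
    x = torch.rand(d, dtype=torch.double)
    x[torch.randperm(d)[:n_zero]] = 0.0  # enforce ||x||_0 <= d - n_zero
    return x / x.sum()  # renormalize so that sum(x) == 1

x0 = sample_feasible_point()
print(x0, x0.sum(), (x0 > 0).sum())  # at most 3 non-zero entries, sums to 1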
I experimented with this equality constraint a bit and it looks like SLSQP actually handles this without any issues, so you have the option of doing what @bernardbeckerman suggested or directly pass down the equality constraint to BoTorch.
@sgbaird, is there anything you need here currently or did you get all the help you needed?
@lena-kashtelyan thanks for checking in! I think I have everything I need for now. It would still be nice to have the option of passing in a predefined list in addition to (the very much coveted) continuous constraints that I'm very excited about; but I've been trying not to ask for everything. You and others have been so helpful!
I've been working on using Ax and got some nice results with hyperparameter optimization, both with the default settings and with SAASBO (the latter of which I think set a new benchmark for a certain materials science problem).
Still working on the other projects, and excited to share the progress.
Great news, @sgbaird, thank you for sharing an update! Let me close this issue for now then, but please feel free to reopen when there should be more discussion (or when you want to share the results / paper you are referring to) : )
@osburg and @jduerholt did some nice testing of SLSQP + narrow Gaussians vs. another method https://github.com/experimental-design/bofire/issues/145.
Relevant comment by @osburg in https://github.com/experimental-design/bofire/issues/145#issuecomment-1505202182 based in part on empirical testing.
Personally, I have the feeling that it will be a very rare event that SLSQP + narrow gaussians will perform better than IPOPT + nchoosek as bounds.
Yeah I think the narrow Gaussian approximation indeed can have vanishing gradients very quickly. One option to deal with this would be to use a distribution with heavier tails so that the gradients don't vanish as quickly and the optimizer can make progress. cc @SebastianAment who has been thinking about similar things a lot recently.
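To illustrate the vanishing-gradient point (a sketch of the idea only, not an API from Ax or BoTorch): a Cauchy-style bump decays polynomially rather than exponentially, so its gradient stays informative much farther from zero:

import torch

def narrow_gaussian(x, ell=1e-3):
    return torch.exp(-0.5 * (x / ell) ** 2)

def narrow_cauchy(x, ell=1e-3):
    # Heavier tails: decays like (ell / x)^2 instead of exp(-(x / ell)^2)
    return 1.0 / (1.0 + (x / ell) ** 2)

x = torch.tensor([0.05], requires_grad=True)
for bump in (narrow_gaussian, narrow_cauchy):
    (grad,) = torch.autograd.grad(bump(x).sum(), x)
    print(bump.__name__, grad.item())  # the Gaussian gradient underflows to 0; the Cauchy one does not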
@sgbaird My guess is that you aren't actually on BoTorch main which means that BoTorch ignores the non-linear inequality constraints (reinstalling Ax probably installed a stable version of BoTorch). Can you try to pull from the main branch from BoTorch and then reinstall BoTorch?
Yeah, it should be possible to pass these constraints down. I think the `model_gen_options` should look like this:

model_gen_options={
    "optimizer_kwargs": {
        "nonlinear_inequality_constraints": [ineq_constraint],
        "batch_initial_conditions": batch_initial_conditions,
        "equality_constraints": [(torch.arange(6), torch.ones(6), 1)],  # sum(x) == 1
    },
}
You'll also have to make sure that the initial Sobol points satisfy these additional constraints!
Hi, I'm not sure how this has gone since the initial thread and the (very informative!) discussions; I've tried to run the notebook from @dme65 but am getting the error:
TypeError: scipy_optimizer() got an unexpected keyword argument 'nonlinear_inequality_constraints'
Short question: would a reinstall of my scipy version solve this?
@Abrikosoff do you have a full repro of this? Looks like somehow the args get passed to the wrong function, but it will be much easier to debug if you can provide a self-contained example. Thanks!
@Abrikosoff do you have a full repro of this? Looks like somehow the args get passed to the wrong function, but it will be much easier to debug if you can provide a self-contained example. Thanks!
@Balandat Thanks for the quick reply! As you predicted, this was a problem of passing the args to the wrong function; for completeness I include below the (slight) rewrite of the code from @dme65 that worked for me:
import torch
from botorch.acquisition import ExpectedImprovement, qExpectedImprovement
from botorch.fit import fit_gpytorch_model
from botorch.models import SingleTaskGP
from botorch.models.transforms import Standardize
from botorch.optim import optimize_acqf
from botorch.test_functions import Hartmann
from gpytorch.mlls import ExactMarginalLogLikelihood
from torch.quasirandom import SobolEngine

K = 2

def narrow_gaussian(x, ell):
    return torch.exp(-0.5 * (x / ell) ** 2)

def ineq_constraint(x, ell=1e-3):
    """
    Each callable is expected to take a `(num_restarts) x q x d`-dim tensor as an
    input and return a `(num_restarts) x q`-dim tensor with the constraint
    values.
    """
    # Approximation of ||x||_0 <= 6 - K (i.e. at least K of the 6 parameters ~= 0).
    # The constraint is >= 0 to conform with SLSQP.
    return narrow_gaussian(x, ell).sum(dim=-1) - K

def get_feasible_sobol_points(n, k):
    """Sobol sequence where we randomly set k of the parameters to zero to satisfy the constraint"""
    X = SobolEngine(dimension=6, scramble=True).draw(n).to(torch.double)
    inds = torch.argsort(torch.rand(n, 6), dim=-1)[:, :k]
    X[torch.arange(X.shape[0]).unsqueeze(-1), inds] = 0
    return X

def get_batch_initial_conditions(num_restarts, raw_samples, q, acqf):
    # uses the module-level k defined below
    X = get_feasible_sobol_points(n=raw_samples * q, k=k).unsqueeze(1)
    X = X.reshape(torch.Size((raw_samples, q, 6)))
    acq_vals = acqf(X)
    return X[acq_vals.topk(num_restarts).indices]

hartmann = Hartmann(dim=6)
k = 2
q = 1

X = get_feasible_sobol_points(n=10, k=K)
Y = hartmann(X).unsqueeze(-1)
print(f"Best initial point: {Y.min().item():.3f}")

gp = SingleTaskGP(X, Y, outcome_transform=Standardize(m=1))
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_model(mll)
EI = qExpectedImprovement(model=gp, best_f=Y.min())

batch_initial_conditions = get_batch_initial_conditions(num_restarts=1, raw_samples=512, acqf=EI, q=q)
print(batch_initial_conditions)

candidate, acq_value = optimize_acqf(
    EI,
    bounds=torch.cat((torch.zeros(1, 6), torch.ones(1, 6))),
    q=q,
    nonlinear_inequality_constraints=[ineq_constraint],
    batch_initial_conditions=batch_initial_conditions,
    num_restarts=20,
    options={"batch_limit": 1, "maxiter": 200},
)
print(candidate)
print(ineq_constraint(candidate))
And of course, this implementation is not compatible with the Service API.
However, I am still curious as to why the original implementation did not work; was it due to an incompatibility between versions of BoTorch? FWIW, I ran the original linked notebook with no modifications. The only difference, as far as I can see, is that instead of calling `optimize_acqf` directly as done here, the original was in the form:
# Experiment
experiment = Experiment(
    name="saasbo_experiment",
    search_space=search_space,
    optimization_config=optimization_config,
    runner=SyntheticRunner(),
)

# Initial Sobol points (set three random parameters to zero)
sobol = Models.SOBOL(search_space=experiment.search_space)
for _ in range(N_INIT):
    trial = sobol.gen(1)
    keys = [f"x{i}" for i in range(6)]
    random.shuffle(keys)
    for k in keys[:3]:
        trial.arms[0]._parameters[k] = 0.0
    experiment.new_trial(trial).run()

# Run SAASBO
data = experiment.fetch_data()
for i in range(N_BATCHES):
    model = Models.FULLYBAYESIAN(
        experiment=experiment,
        data=data,
        num_samples=256,  # Increasing this may result in better model fits
        warmup_steps=512,  # Increasing this may result in better model fits
        gp_kernel="matern",  # "rbf" is the default in the paper, but we also support "matern"
        torch_dtype=torch.double,
        verbose=False,  # Set to True to print stats from MCMC
        disable_progbar=True,  # Set to False to print a progress bar from MCMC
    )
    batch_initial_conditions = get_batch_initial_conditions(
        n=20, X=model.model.Xs[0], Y=model.model.Ys[0], raw_samples=1024
    )
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # Filter SLSQP warnings
        generator_run = model.gen(
            BATCH_SIZE,
            model_gen_options={
                "optimizer_kwargs": {
                    "inequality_constraints": [ineq_constraint],
                    "batch_initial_conditions": batch_initial_conditions,
                }
            },
        )
    trial = experiment.new_batch_trial(generator_run=generator_run)
    for arm in trial.arms:
        arm._parameters = {k: 0.0 if v < 1e-3 else v for k, v in arm.parameters.items()}
        assert sum([v > 1e-3 for v in arm.parameters.values()]) <= 3
    trial.run()
    data = Data.from_multiple_data([data, trial.fetch_data()])
    new_value = trial.fetch_data().df["mean"].min()
    print(
        f"Iteration: {i}, Best in iteration {new_value:.3f}, Best so far: {data.df['mean'].min():.3f}"
    )
however I am still curious as to why the original implementation did not work; was it due to incompatibility between versions of BoTorch? FWIW, I ran the original linked notebook with no modifications. The only difference, as far as I can see, is that instead of doing optimize_acqf as done here, the original was in the form
Are you sure this is exactly what you ran? The following error reported above seems to suggest you used `nonlinear_inequality_constraints` and not `inequality_constraints` (which is the scipy arg).
TypeError: scipy_optimizer() got an unexpected keyword argument 'nonlinear_inequality_constraints'
however I am still curious as to why the original implementation did not work; was it due to incompatibility between versions of BoTorch? FWIW, I ran the original linked notebook with no modifications. The only difference, as far as I can see, is that instead of doing optimize_acqf as done here, the original was in the form
Are you sure this is exactly what you ran? The following error reported above seems to suggest you used `nonlinear_inequality_constraints` and not `inequality_constraints` (which is the scipy arg).
TypeError: scipy_optimizer() got an unexpected keyword argument 'nonlinear_inequality_constraints'
@Balandat Thanks for the reply! You are right, I posted an attempted correction by mistake; for completeness, the following is what happened: the original code (the relevant section) was in this form:
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # Filter SLSQP warnings
    generator_run = model.gen(
        BATCH_SIZE,
        model_gen_options={
            "optimizer_kwargs": {
                "nonlinear_inequality_constraints": [ineq_constraint],
                "batch_initial_conditions": batch_initial_conditions,
            }
        },
    )
When I ran this, the `TypeError: scipy_optimizer() got an unexpected keyword argument 'nonlinear_inequality_constraints'` error was thrown. I then looked under the hood of the `gen` method (specifically looking for where `scipy_optimizer` is called) and saw that one of the args that could be passed to it was called `inequality_constraints`. I then modified the code to the form
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # Filter SLSQP warnings
    generator_run = model.gen(
        BATCH_SIZE,
        model_gen_options={
            "optimizer_kwargs": {
                "inequality_constraints": [ineq_constraint],
                "batch_initial_conditions": batch_initial_conditions,
            }
        },
    )
(posted above), at which point the error `TypeError: ax.models.torch.botorch_defaults.scipy_optimizer() got multiple values for keyword argument 'inequality_constraints'` was thrown.
I am suspecting that this is a version problem, but I am a bit mystified, TBH. Any help would be greatly appreciated!
As has popped up in various other issues (#727, #745, #750) and per the discussion in a recent meeting with @lena-kashtelyan, @Balandat, and @bernardbeckerman, there is a need for being able to "allow Ax to take in some callable that evaluates the constraint and that we can pass to the optimizer" (https://github.com/facebook/Ax/issues/745#issuecomment-991413498), albeit as "an excellent way for people to shoot themselves in the foot".
I started digging through the Ax code but had trouble identifying where this might happen. I was using the service API, setting breakpoints, and stepping into functions. Any ideas on where this callable would get passed to the optimizer?