Open JunzeYang opened 7 months ago
Hi, thanks for your interest in BODi! Regarding your first question about problems that involve both binary and categorical variables, there are a couple of ways to do this:
It would be useful if you could provide me with some numbers, such as the input dimensionality and the numbers of categorical and binary variables.
The BoTorch method relevant to mixed optimization is optimize_acqf_mixed: https://botorch.org/api/_modules/botorch/optim/optimize.html#optimize_acqf_mixed.
Please let me know if it helps.
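In case it helps to see the idea, here is a rough pure-Python sketch of the fixed-features strategy that this kind of mixed optimizer follows (illustrative only, not BoTorch's actual implementation; `optimize_mixed` and the toy acquisition are hypothetical names): for each combination of discrete values, optimize the remaining continuous dimensions and keep the overall best candidate.

```python
import itertools

def optimize_mixed(acq, discrete_choices, cont_grid):
    """Toy stand-in for the fixed-features strategy: enumerate every
    combination of discrete values, optimize the continuous part for each
    (here by brute-force grid search), and keep the overall best point."""
    best_x, best_val = None, float("-inf")
    for combo in itertools.product(*discrete_choices):
        for xc in cont_grid:  # crude continuous "optimizer"
            val = acq(list(combo) + [xc])
            if val > best_val:
                best_x, best_val = list(combo) + [xc], val
    return best_x, best_val

# Toy acquisition: prefers both binary dims at 1 and the continuous dim near 0.5.
acq = lambda x: x[0] + x[1] - (x[2] - 0.5) ** 2
x_best, v_best = optimize_mixed(
    acq,
    discrete_choices=[[0, 1], [0, 1]],      # two binary dimensions
    cont_grid=[i / 10 for i in range(11)],  # one continuous dimension
)
```

The cost grows with the number of discrete combinations, which is why this enumeration approach suits problems with few discrete variables.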
Hi, thank you for the guidance. My problem involves 10 binary variables and 30 ordinal variables (5 ordered categories reflecting the degree of intensity). Converting the ordinal variables to a one-hot encoding would increase the dimensionality of the problem, so applying the additive kernel (the second option you provided) seems like a good choice.
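To make the dimensionality concern concrete, a quick plain-Python sketch (illustrative only): one-hot encoding the 30 ordinal variables with 5 levels each turns those 30 dimensions into 150, so the total input grows from 40 to 160 dimensions.

```python
def one_hot(level, num_levels=5):
    """Encode an ordinal level (0..num_levels-1) as a one-hot vector."""
    v = [0] * num_levels
    v[level] = 1
    return v

n_binary, n_ordinal, n_levels = 10, 30, 5
ordinal_point = [2] * n_ordinal  # an example setting of the 30 ordinal variables
encoded = [bit for lvl in ordinal_point for bit in one_hot(lvl, n_levels)]

dim_ordinal = n_binary + n_ordinal   # 40 input dims if kept ordinal
dim_one_hot = n_binary + len(encoded)  # 160 input dims after one-hot encoding
```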
By the way, when you mention the additive kernel, I assume you're referring to modifying generate_random_basis_vector in dictionary_kernel.py, so that each $a_i$ in the dictionary $A$ includes elements corresponding to both binary and ordinal variables. Please correct me if I'm wrong.
As for the optimize_acqf_mixed function, it seems to be designed for optimizing a mix of continuous and discrete variables.
Thank you again for your explanation.
Thanks for the information. In your case, I would suggest going with the following option:
Since the number of binary variables is relatively small, I suggest using BoTorch's categorical kernel for the binary part and BODi's dictionary kernel for the categorical part.
We can define an additive kernel via the covar_module argument of a BoTorch SingleTaskGP model. This can be done by simply adding objects of the two different kernels in the base_kernel argument of the ScaleKernel constructor.
For example:

```python
import torch
from botorch.models.kernels.categorical import CategoricalKernel
from gpytorch.constraints import GreaterThan
from gpytorch.kernels import ScaleKernel
from gpytorch.priors import GammaPrior

from dictionary_kernel import DictionaryKernel  # BODi's dictionary_kernel.py

# tkwargs, n_prototype_vectors, and the problem definition f come from the
# surrounding BODi setup.
# Binary part: the first 10 dimensions.
categorical_kernel = CategoricalKernel(active_dims=torch.arange(0, 10))
# Categorical part: the remaining 30 dimensions.
dictionary_kernel = DictionaryKernel(
    num_basis_vectors=n_prototype_vectors,
    categorical_dims=f.categorical_dims,
    num_dims=f.dim,
    similarity=True,
    active_dims=torch.arange(10, 40),
)
covar_module = ScaleKernel(
    base_kernel=categorical_kernel + dictionary_kernel,
    outputscale_prior=GammaPrior(torch.tensor(2.0, **tkwargs), torch.tensor(0.15, **tkwargs)),
    outputscale_constraint=GreaterThan(1e-6),
)
```
The active_dims argument above specifies which input dimensions are active for each kernel; I am assuming the first 10 dimensions are binary and the rest are categorical. Please let me know if it works. Happy to provide more details.
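As a sanity check on the idea, here is a plain-Python sketch of what the additive combination computes (illustrative only, not the GPyTorch implementation; the overlap kernel is a hypothetical stand-in): each kernel sees only its own active dimensions, and the results are summed.

```python
def make_additive_kernel(k1, dims1, k2, dims2):
    """Sum of two kernels, each restricted to its own 'active' dimensions."""
    def k(x, y):
        x1, y1 = [x[i] for i in dims1], [y[i] for i in dims1]
        x2, y2 = [x[i] for i in dims2], [y[i] for i in dims2]
        return k1(x1, y1) + k2(x2, y2)
    return k

# Toy stand-in for both parts: fraction of matching entries (Hamming similarity).
overlap = lambda a, b: sum(u == v for u, v in zip(a, b)) / len(a)

k = make_additive_kernel(overlap, range(0, 10), overlap, range(10, 40))
x = [1] * 10 + [3] * 30  # 10 binary dims followed by 30 categorical dims
y = [1] * 10 + [0] * 30  # same binary block, entirely different categorical block
```

Here k(x, y) = 1.0 + 0.0: the binary block matches perfectly while the categorical block contributes nothing, which is exactly the decoupling the additive structure buys.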
Alright, thanks for your assistance. I will try out your suggestions. Many thanks!
Hello,
I am currently dealing with a black-box problem that involves a number of binary variables and categorical variables (5 categories). I believe the proposed BODi has great potential for application to this high-dimensional problem. However, the experiments and code only cover three scenarios: problems with only binary variables, only categorical variables, and both binary and continuous variables. How can this method be extended to problems that involve both binary and categorical variables?

Additionally, I noticed that when solving for $z_j = \arg\max_{z \in \mathcal{Z}} \alpha(\mathcal{M}(\phi_A(z)))$ in the discrete space $\mathcal{Z}$, optimize.py provides two functions, i.e., optimize_acqf_binary_local_search and optimize_acqf_categorical_local_search, for binary variables and categorical variables, respectively. If I want to apply it to a problem that involves mixed binary and categorical variables, can I directly replace these with botorch.optim.optimize.optimize_acqf_discrete_local_search from the BoTorch package, since that function takes a discrete_choices parameter for each dimension? Thanks
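For what it's worth, the per-dimension discrete_choices idea does handle the mixed case naturally, since a binary dimension is just a choice list of length two. Here is a minimal pure-Python sketch of such a coordinate-wise discrete local search (illustrative only, not BoTorch's implementation; the function and objective names are hypothetical):

```python
import random

def discrete_local_search(f, discrete_choices, n_restarts=3, seed=0):
    """Greedy coordinate-wise ascent over per-dimension choice lists."""
    rng = random.Random(seed)
    best_x, best_val = None, float("-inf")
    for _ in range(n_restarts):
        x = [rng.choice(c) for c in discrete_choices]  # random start point
        val, improved = f(x), True
        while improved:
            improved = False
            for i, choices in enumerate(discrete_choices):
                for v in choices:  # try every alternative value for dim i
                    if v != x[i]:
                        cand = x[:i] + [v] + x[i + 1:]
                        cand_val = f(cand)
                        if cand_val > val:
                            x, val, improved = cand, cand_val, True
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Mixed space: 10 binary dims ([0, 1]) and 30 categorical dims with 5 choices.
choices = [[0, 1]] * 10 + [list(range(5))] * 30
target = [1] * 10 + [4] * 30
score = lambda x: sum(a == b for a, b in zip(x, target))  # separable toy objective
x_opt, v_opt = discrete_local_search(score, choices)
```

Because the toy objective is separable, coordinate ascent recovers the exact optimum here; on a real acquisition surface the restarts guard against local maxima.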