On dealing with mixed binary and categorical variables

JunzeYang commented 7 months ago

Hello,

I am currently dealing with a black-box problem that involves a number of binary variables and categorical variables (5 categories each). I believe that the proposed BODi has great potential for this high-dimensional problem. However, the experiments and code only cover three scenarios: problems with only binary variables, only categorical variables, and both binary and continuous variables. How can the method be extended to problems that involve both binary and categorical variables?

Additionally, I noticed that when solving $z_j = \arg\max_{z \in Z} \alpha(\mathcal{M}(\phi_A(z)))$ over the discrete space $Z$, optimize.py provides two functions, optimize_acqf_binary_local_search and optimize_acqf_categorical_local_search, for binary and categorical variables, respectively. If I want to apply it to a problem with mixed binary and categorical variables, can I directly replace these with botorch.optim.optimize.optimize_acqf_discrete_local_search from the BoTorch package, since that function takes a discrete_choices parameter listing the allowed values for each dimension?
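To make the per-dimension choices idea in the question concrete, here is a minimal sketch of a greedy local search over a mixed binary/five-level space. It is not BoTorch's optimize_acqf_discrete_local_search nor the routines in BODi's optimize.py; the names local_search, acq, choices, and target are hypothetical, and the acquisition function is a toy stand-in.

```python
import numpy as np

def local_search(acq, choices, x0, max_steps=100):
    """Greedy search: accept any single-coordinate change that improves acq."""
    x, best = x0.copy(), acq(x0)
    for _ in range(max_steps):
        improved = False
        for i, vals in enumerate(choices):
            for v in vals:
                if v == x[i]:
                    continue
                cand = x.copy()
                cand[i] = v
                val = acq(cand)
                if val > best:
                    x, best, improved = cand, val, True
        if not improved:
            break  # no single-coordinate change improves: local optimum
    return x, best

# Allowed values per dimension: 10 binary dims, then 30 five-level dims.
choices = [[0, 1]] * 10 + [[0, 1, 2, 3, 4]] * 30

# Toy acquisition (an assumption for illustration): closeness to a hidden target.
rng = np.random.default_rng(0)
target = np.array([rng.choice(vals) for vals in choices])
acq = lambda x: -float(np.sum((x - target) ** 2))

x_best, best = local_search(acq, choices, np.zeros(len(choices), dtype=int))
```

Because each dimension carries its own list of allowed values, binary and categorical dimensions are handled by the same loop, which is the property the question is relying on.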

Thanks

aryandeshwal commented 7 months ago

Hi, thanks for your interest in BODi. Regarding your first question about problems that involve both binary and categorical variables, I think there are a couple of ways to do this:

1. We can convert the categorical variables to binary via one-hot encoding and run BODi with binary variables.
2. If the numbers of categorical and binary variables are both large, we can use an additive kernel, with one component using the BODi categorical version and the other using the BODi binary version.
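As a rough illustration of the one-hot option mentioned above, the sketch below expands integer-coded categorical columns into binary columns and appends them to the existing binary ones. The helper name one_hot_encode and the small sizes are assumptions chosen for illustration, not part of BODi.

```python
import numpy as np

def one_hot_encode(x_cat: np.ndarray, num_categories: int) -> np.ndarray:
    """Map integer codes of shape (n, d) to a binary array of shape (n, d * num_categories)."""
    n, d = x_cat.shape
    eye = np.eye(num_categories, dtype=np.int64)
    # eye[x_cat] has shape (n, d, num_categories); flatten the last two axes.
    return eye[x_cat].reshape(n, d * num_categories)

rng = np.random.default_rng(0)
x_bin = rng.integers(0, 2, size=(4, 3))  # 3 binary variables
x_cat = rng.integers(0, 5, size=(4, 2))  # 2 categorical variables, 5 categories each

# All-binary representation: 3 + 2 * 5 = 13 columns.
x_all_binary = np.concatenate([x_bin, one_hot_encode(x_cat, 5)], axis=1)
```

Note that each categorical variable becomes a group of bits in which exactly one bit is set, so the dimensionality grows by a factor equal to the number of categories, and a binary optimizer that flips single bits can leave the set of valid encodings unless that constraint is enforced.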

It would be useful if you could give me some numbers, such as the input dimensionality, the number of categorical variables, and the number of binary variables.

The BoTorch method relevant to mixed optimization is optimize_acqf_mixed: https://botorch.org/api/_modules/botorch/optim/optimize.html#optimize_acqf_mixed.

Please let me know if it helps.

JunzeYang commented 7 months ago

Hi, thank you for your kind instruction. My problem involves 10 binary variables and 30 ordinal variables (5 ordered categories reflecting the degree of intensity). If I were to convert the ordinal variables into one-hot encoding, it might increase the complexity of the problem. Applying the additive kernel (the second one you provided) seems like a good choice.

By the way, when you mention the additive kernel, I assume you’re referring to modifying the generate_random_basis_vector in dictionary_kernel.py, so that each $a_{i}$ in dictionary $A$ includes elements corresponding to both binary variables and ordinal variables. Please correct me if I’m wrong.

As for the optimize_acqf_mixed function, it seems particularly designed for optimizing a mix of continuous and discrete variables.

Thank you again for your explanation.

aryandeshwal commented 7 months ago

Thanks for the information. In your case, I would suggest going with the following option:

Since the number of binary variables is relatively small, I suggest using BoTorch's categorical kernel for the binary part and the BODi dictionary kernel for the categorical part. We can define an additive kernel in the covar_module argument of a BoTorch SingleTaskGP model. This can be done by simply adding the two kernel objects together and passing the sum as the base_kernel argument of the ScaleKernel constructor.
For example:

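import torch
from botorch.models.kernels.categorical import CategoricalKernel
from gpytorch.constraints import GreaterThan
from gpytorch.kernels import ScaleKernel
from gpytorch.priors import GammaPrior
from dictionary_kernel import DictionaryKernel  # BODi's dictionary_kernel.py; adjust the import path to your checkout

# n_prototype_vectors, f (the objective), and tkwargs are assumed to be defined by the surrounding BODi script.
# BoTorch categorical kernel over the 10 binary dimensions (indices 0-9).
categorical_kernel = CategoricalKernel(active_dims=torch.arange(0, 10))
# BODi dictionary kernel over the 30 categorical dimensions (indices 10-39).
dictionary_kernel = DictionaryKernel(
    num_basis_vectors=n_prototype_vectors,
    categorical_dims=f.categorical_dims,
    num_dims=f.dim,
    similarity=True,
    active_dims=torch.arange(10, 40),
)
# The sum of the two kernels is the additive kernel; ScaleKernel adds a learned output scale.
covar_module = ScaleKernel(
    base_kernel=categorical_kernel + dictionary_kernel,
    outputscale_prior=GammaPrior(torch.tensor(2.0, **tkwargs), torch.tensor(0.15, **tkwargs)),
    outputscale_constraint=GreaterThan(1e-6),
)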

The active_dims argument above states which input dimensions each kernel acts on. I am assuming the first 10 dimensions are binary and the rest are categorical. Please let me know if it works. Happy to provide more details.
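For readers who want to see what an additive kernel over disjoint active dimensions computes, here is a small NumPy sketch. It is an illustration under stated assumptions, not the BODi or BoTorch code: hamming_kernel is a simple exponentiated mean-Hamming-distance kernel, and it is used as a stand-in for both components, including the part where the actual suggestion uses the BODi dictionary kernel. The function names and lengthscale values are hypothetical.

```python
import numpy as np

def hamming_kernel(x1, x2, lengthscale=1.0):
    """exp(-mean Hamming distance / lengthscale) between rows of x1 and x2."""
    mismatches = x1[:, None, :] != x2[None, :, :]  # shape (n1, n2, d)
    return np.exp(-mismatches.mean(axis=-1) / lengthscale)

def additive_kernel(x1, x2, binary_dims, ordinal_dims):
    """Sum of two kernels, each restricted to its own block of dimensions."""
    k_bin = hamming_kernel(x1[:, binary_dims], x2[:, binary_dims])
    # Stand-in for the second component (NOT the BODi dictionary kernel).
    k_ord = hamming_kernel(x1[:, ordinal_dims], x2[:, ordinal_dims], lengthscale=0.5)
    return k_bin + k_ord

rng = np.random.default_rng(0)
x = np.concatenate(
    [rng.integers(0, 2, size=(6, 10)), rng.integers(0, 5, size=(6, 30))], axis=1
)
binary_dims = np.arange(0, 10)    # first 10 dimensions are binary
ordinal_dims = np.arange(10, 40)  # remaining 30 take values in {0, ..., 4}

K = additive_kernel(x, x, binary_dims, ordinal_dims)  # 6 x 6 Gram matrix
```

Each component only sees its own columns, which mirrors the role of active_dims, and the sum of two valid kernels is again a valid kernel, so the Gram matrix K stays symmetric and positive semidefinite.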

JunzeYang commented 7 months ago

Alright, thanks for your assistance. I will try out your suggestions. Many thanks!

aryandeshwal / BODi

On dealing with mixed binary and categorical variables #2