experimental-design / bofire

Experimental design and (multi-objective) Bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License

Initial attempt to incorporate MultiTask GPs #353

Closed jpfolch closed 4 months ago

jpfolch commented 6 months ago

Initial attempt to incorporate MultiTask GPs, with the long-term plan of incorporating the multi-fidelity algorithm as described here.

Current issues:

Future work:

jpfolch commented 6 months ago

And I really like the TaskInput solution. I'll get started on it and push the changes.

jduerholt commented 6 months ago

> And I really like the TaskInput solution. I'll get started on it and push the changes.

Great, I will think in the meantime about the other open questions.

jpfolch commented 6 months ago

I've now got an implementation of TaskInputs inheriting from CategoricalVariable, and it seems to work fine for training and fitting the model. A few issues remain:

  1. Using OneHotToNumeric has to be done outside BoTorch's model, since the first line of the posterior call on MultiTaskGPytorchModel is includes_task_feature = X.shape[-1] == self.num_non_task_features + 1, which checks whether the task feature is included in X (i.e. it fully expects the task_id to be a single value in one column). The input transform is not applied until after this check, so under one-hot encoding the check fails and the model returns predictions for all tasks for all inputs. Three ways of fixing this: (a) try to fix it within BoTorch and see if they accept the changes, (b) change the transform within BoFire, or (c) apply the transform before calling the posterior and before setting the training data (the current code does this).

  2. I am running into issues with pydantic and serialization. When running surrogate_data.model_dump_json() I get the following warning: Expected Union[definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref, definition-ref] but got TaskInput - serialized value may not be as expected. I think this is causing the subsequent problem where parse_obj_as(MultiTaskGPSurrogate, json.loads(jspec)) fails to run, as it fails validation by passing the wrong arguments to the wrong classes. Any help on how to fix this would be appreciated.
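The shape check in point 1 can be illustrated with a small pure-Python sketch (the feature counts are made up; includes_task_feature simply mirrors the line quoted above):

```python
# BoTorch applies the input transform only after this shape check, so under
# one-hot encoding the task columns are miscounted. Pure-Python sketch:

def includes_task_feature(num_columns, num_non_task_features):
    # Mirrors `X.shape[-1] == self.num_non_task_features + 1`: the task id
    # is expected to occupy exactly one extra column.
    return num_columns == num_non_task_features + 1

num_non_task_features = 3
num_tasks = 4

# Ordinal task id: one extra column, the check passes.
print(includes_task_feature(num_non_task_features + 1, num_non_task_features))  # True

# One-hot task id: num_tasks extra columns, the check fails, and the model
# returns predictions for all tasks instead.
print(includes_task_feature(num_non_task_features + num_tasks, num_non_task_features))  # False
```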

jduerholt commented 6 months ago

Hi @jpfolch,

as already discussed, you can find a proper implementation for the TaskInput feature here https://github.com/experimental-design/bofire/pull/360 which should be merged soon. Just use this one.

Regarding your point with the one-hot input transform: I did not know this, and it is a pity. Getting a fix into botorch will take some time, and your solution will not work with an optimizer, as the optimizer will just call posterior directly. For this reason, we have to use an OrdinalEncoding for the categorical TaskInput.

This is implemented here: https://github.com/experimental-design/bofire/blob/4790d637016f4e482ae8bafa6cb711b66e26b99e/bofire/data_models/features/categorical.py#L245

The actual encoding taken into account by the model is based on the attribute input_preprocessing_specs which is validated/generated here: https://github.com/experimental-design/bofire/blob/ccb0bc2998f35ede3118b490e0ce387244898305/bofire/data_models/surrogates/surrogate.py#L21

For the MultiTaskGP, we then have to override it so that we assign CategoricalEncodingEnum.ORDINAL to the task feature. If you do this, you should already be able to use the model. For using it in optimization, we then also have to make some adjustments, but this will not be a problem.

So, I would recommend merging main in and changing the model to ORDINAL encoding for the task feature. Note that standard models should still use ONEHOT. You will also need to adjust the input scaler to ignore the task feature, so that it stays an integer.
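The override described here could look roughly like the following sketch (a simplification, not BoFire's actual validator; feature types are given as plain strings and the helper name is made up):

```python
from enum import Enum

class CategoricalEncodingEnum(Enum):
    ONEHOT = "ONEHOT"
    ORDINAL = "ORDINAL"

def generate_input_preprocessing_specs(feature_types):
    """Map feature key -> encoding: ORDINAL for the task feature so it
    stays a single integer column, ONEHOT for ordinary categoricals."""
    specs = {}
    for key, ftype in feature_types.items():
        if ftype == "TaskInput":
            specs[key] = CategoricalEncodingEnum.ORDINAL
        elif ftype == "CategoricalInput":
            specs[key] = CategoricalEncodingEnum.ONEHOT
        # continuous features need no categorical encoding
    return specs

specs = generate_input_preprocessing_specs(
    {"x1": "ContinuousInput", "catalyst": "CategoricalInput", "task": "TaskInput"}
)
```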

Best,

Johannes

jpfolch commented 6 months ago

Perfect, thanks! I will get started on fixing the encoding for TaskInputs.

Regarding not being able to pickle Index kernels, @TobyBoyne found the bug within GPyTorch and submitted a PR which has been approved, so it should be fixed eventually, but we will need a temporary fix for the next few months at least.

TobyBoyne commented 6 months ago

Hey, while you're waiting for the patch to hit GPyTorch, you can fix this in BoFire by manually registering the prior. I've demonstrated this in https://github.com/TobyBoyne/bofire/commit/f9df6b096867294ca297f7035dda4f68b58efcac , just copy this change over to your code and you will be able to pickle the kernel :)
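For context, the pickling failure is plausibly the usual closure problem: pickle serializes functions by reference, so a prior registered with an anonymous lambda setting closure makes the whole kernel unpicklable, while a named, module-level function does not. A minimal sketch of that failure mode (illustrative names; this is not GPyTorch's actual API):

```python
import pickle

def set_var(obj, value):
    # Named, module-level function: pickle can reference it by name.
    obj.var = value

try:
    pickle.dumps(lambda obj, value: None)  # anonymous: no importable name
    lambda_picklable = True
except Exception:
    lambda_picklable = False

print(lambda_picklable)  # False
```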

jduerholt commented 5 months ago

Hey @jpfolch,

I was curious to know what the botorch guys think about changing the order of operations regarding the input transforms in the posterior call so I created this issue: https://github.com/pytorch/botorch/issues/2232

Let's see ;)

Best,

Johannes

jpfolch commented 5 months ago

The latest commit added the missing validations for the model and added testing. Overall, the multi-task model seems to work now. The only problem I am running into is that the test test_MultiTaskGPModel fails randomly (independently of the test parameters). In particular, the problem happens when calling fit_gpytorch_mll, and the error thrown is RuntimeError: Must provide inverse transform to be able to sample from prior. I will try to investigate further in the next few days.

jduerholt commented 5 months ago

> The only problem I am running into is that the test test_MultiTaskGPModel fails randomly (independently of the test parameters). In particular, the problem happens when calling fit_gpytorch_mll, and the error thrown is RuntimeError: Must provide inverse transform to be able to sample from prior. I will try to investigate further in the next few days.

I have not yet looked at your additions, but do you think this is a bofire or a botorch problem? I will have a look at your changes tmr.

jduerholt commented 5 months ago

Hi @jpfolch,

this looks good overall. I will do a more thorough review as soon as you request it and you know more about the fitting issue. To test serialization and deserialization of the added data models (the new GP and the new prior), please add example configs in tests/bofire/data_models/specs/surrogates.py and tests/bofire/data_models/specs/priors.py. Everything that is registered there is automatically tested for serialization. With add_invalid, you can add invalid specs to test your custom validators.
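What those automatic spec tests effectively do can be sketched in plain Python (illustrative names; real specs are pydantic data models rather than dicts):

```python
import json

# Every registered example config is roundtripped through serialization,
# and the roundtrip must reproduce the original config unchanged.

SPECS = []

def register(config):
    SPECS.append(config)
    return config

register({"type": "MultiTaskGPSurrogate", "n_tasks": 2})
register({"type": "LKJPrior", "shape": 2.0})

def roundtrip_ok(config):
    # Serialize to JSON and parse back; the spec must survive unchanged.
    return json.loads(json.dumps(config)) == config

results = [roundtrip_ok(c) for c in SPECS]
```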

Best,

Johannes

jpfolch commented 5 months ago

Sorry, I haven't been able to work on this due to other commitments, but I should have more time now. Regarding the fit_gpytorch_mll problem: I am still unsure what is causing it, but I think it could be a gpytorch problem, based on the following two issues: https://github.com/pytorch/botorch/issues/1323 and https://github.com/pytorch/botorch/issues/1860.

The best solution I have come up with is to drop the LKJ prior altogether until the issue is fixed, and use MultiTaskGPs without a prior on the task covariances.

jpfolch commented 5 months ago

The latest commit adds the example configs to tests/bofire/data_models/specs/surrogates.py and tests/bofire/data_models/specs/priors.py. I have left the LKJ prior in the code, but currently it cannot be used: when set, it defaults to None and throws a warning. Let me know if this behaviour is okay or should be changed.
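The fallback behaviour described here could look like the following sketch (resolve_task_prior is a hypothetical name; in the actual code the check would live in a pydantic validator):

```python
import warnings

def resolve_task_prior(task_prior):
    """Until the upstream fix lands, any prior supplied for the task
    covariance is dropped with a warning instead of being passed on to
    the model. Illustrative sketch, not BoFire's actual code."""
    if task_prior is not None:
        warnings.warn(
            "LKJ priors on the task covariance are not yet supported; "
            "falling back to no prior."
        )
        return None
    return task_prior
```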

jduerholt commented 5 months ago

Thx, will have a look over the course of the week.

jpfolch commented 4 months ago

I've made all the requested changes. All tests pass locally :).

jduerholt commented 4 months ago

Hi @jpfolch,

thank you very much, can you also resolve the merge conflicts by merging the main branch into this one?

Best,

Johannes

jpfolch commented 4 months ago

All done, conflicts should be resolved now. I tested, and all tests passed except the EntingStrategy spec, but I don't have entmoot installed which explains it.

jduerholt commented 4 months ago

Tests are running through ;), can you fix the linting errors too?

jpfolch commented 4 months ago

I think the linting errors should be fixed now.

jpfolch commented 4 months ago

Hi @jduerholt,

Thank you very much! I've also been slow as I have been away and juggling other things meanwhile. Happy to see it has been merged.

The MultiTaskGP can be used directly to model the fidelities, so I think the next step would be incorporating it into the strategy? That is, assuming the strategies are the part of the code where the BO chooses which experiment to do next?

Best, Jose

jduerholt commented 3 months ago

Hi @jpfolch,

now I am back from parental leave and have again more time for BoFire.

Yes, incorporation into the strategies would be the next step. The adaptations have to be made here: https://github.com/experimental-design/bofire/blob/main/bofire/strategies/predictives/botorch.py

In particular, we have to adjust the handling of fixed features, as we currently assume that categoricals are always one-hot encoded. Honestly, this whole part of the code needs to be cleaned up. Are you interested in doing it, or should I make a first draft?

Best,

Johannes

jduerholt commented 3 months ago

It would be especially important to know how the optimization over the TaskInput should work: do we optimize the ACQF for all allowed tasks and return the candidate with the highest ACQF value? Or is the evaluation cost also somehow considered, or is only the highest fidelity queried? Do we need specific ACQFs for this case?

jpfolch commented 2 months ago

Hi @jduerholt,

Sorry for the delay in replying, I have been busy finishing up work on my PhD. I am currently writing up, but I should have some time to work on BoFire. Based on my previous work, the way I recommend going about optimizing over TaskInput is:

  1. We first choose the next point of interest by returning the candidate with the highest acqf value at the target fidelity.

  2. We then choose which fidelity to evaluate based on one of two criteria: (a) If the fidelities are strictly ordered, we can simply choose the lowest fidelity until we have reduced the uncertainty enough, then the second lowest, and so on... the idea being that we want to use the lowest possible fidelity that we believe might still be informative about our target. This criterion is cheap to compute, generally stable, and I found it to work well. However, it does not take cost into account. (b) A second criterion, which is more general and does take cost into account, is to consider the information gain of each fidelity about the target fidelity per unit cost. However, it can be expensive to compute accurately and numerically unstable. There is a BoTorch implementation of the method though, which is probably well optimized (the single-fidelity method is described in this tutorial, but there is a multi-fidelity version in the source code).

Approach (b) would be equivalent to first optimizing any acquisition function at the target fidelity, and then optimizing the task input only with the MF-MES acquisition function. The benefit of this is that it gives you more flexibility in the choice of acquisition function, and it is computationally cheaper than optimizing MF-MES directly (which is another possible route).
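The two fidelity-selection criteria can be sketched as plain functions (illustrative; in practice the variances, gains, and costs would come from the fitted model's posterior at the chosen candidate):

```python
def select_fidelity_ordered(variances, threshold):
    """Criterion (a): fidelities ordered cheapest -> target. Pick the
    cheapest fidelity whose posterior variance at the candidate is still
    above `threshold`, i.e. the cheapest one that still looks informative;
    otherwise fall back to the target (last) fidelity."""
    for i, var in enumerate(variances):
        if var > threshold:
            return i
    return len(variances) - 1

def select_fidelity_by_gain(gains, costs):
    """Criterion (b): maximize information gain about the target fidelity
    per unit evaluation cost."""
    return max(range(len(gains)), key=lambda i: gains[i] / costs[i])

# The lowest fidelity is already well explored (low variance), so the next
# cheapest informative fidelity is chosen:
print(select_fidelity_ordered([0.05, 0.3, 0.9], threshold=0.1))  # 1
```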

Let me know which of the two options you would prefer, and I can get started on it.

Best, Jose

jpfolch commented 2 months ago

@bertiqwerty

jduerholt commented 2 months ago

Hi @jpfolch,

totally forgot this. I will have a detailed look tomorrow. Sorry. Too much to do :(

Best,

Johannes

jduerholt commented 1 month ago

Hi @jpfolch,

sorry for the delay, the last weeks were crazy. I am not so much into multi-fidelity optimization, so I have a few questions:

In general, I am always for starting with the easiest option ;)

Best,

Johannes