experimental-design / bofire

Experimental design and (multi-objective) bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License
188 stars 22 forks source link

Add mixed tanimoto gp surrogate #318

Closed xxEthene closed 7 months ago

xxEthene commented 8 months ago

This PR adds a new surrogate type, MixedTanimotoGPSurrogate, which is designed to be used with datasets that contains MolecularInput's with Fingerprints, Fragments, FingerprintsFragments molecular features, and Continuous and/or Categorical features. Therefore, this surrogate is analogous to MixedSingleTaskGP except it involves MolecularInputs's. In order to provide the flexibility for this surrogate type, some other changes were made which will also be mentioned below:

xxEthene commented 8 months ago

I have made the changes as suggested.

In addition, I have also slightly cleaned up the MixedSingleTaskGPSurrogate _fit function to use the get_continuous_features, get_categorical_features, and get_feature_indices functions.

jduerholt commented 8 months ago

Another small info: we merged in a PR which has refactored the test suite (https://github.com/experimental-design/bofire/pull/327). You need to merge main again into your branch, but the effects will be small. You just have to move the stuff which creates conflicts to the new positions.

In case of problems, I can also help you or do it together with you!

xxEthene commented 8 months ago

Hi @jduerholt, I have merged my branch with the main branch and solved the conflicts :) Please help me check whether it is correct. Thanks for all your constructive feedback!

jduerholt commented 8 months ago

Hi @xxEthene,

many thanks for the updates. I will have a look on them over the christmas days. Sorry for the delay!

Best,

Johannes

xxEthene commented 8 months ago

Hi @jduerholt, I have made some changes to the codes based on the errors previously occurred. I accidentally removed the priors importing in mixed_single_task_gp and kept the test_features.py file when I merged with the main branch. These have been solved and I have moved some tests to the correct place :) However, for the rest failed tests related to strategies, I am not sure about it as I did not modify these files and the tests have passed on my computer. I am looking forward to your feedback on this! Wish you a great Christmas holiday and happy new year~~

jduerholt commented 8 months ago

Hi @xxEthene; just ignore the failing tests, this is due to a new version of formulaic released on the 25th of December (https://pypi.org/project/formulaic/#history) which breaks our tests. @Osburg, can you take care for this?

Regarding your PR: I will do a final review as soon as I am back in office next Tuesday! I wish you also a happy new year!

jduerholt commented 7 months ago

I stumbled over this order vs order_id issue also when working on this PR https://github.com/experimental-design/bofire/pull/279, the change to order_id will result in some failing tests in test_inputs.py, I fixed it in the other PR, you can just copy it over. I would not recommend to merge the other PR into yours.

jduerholt commented 7 months ago

In PR https://github.com/experimental-design/bofire/pull/332 the problems regarding the failing tests in the DoE module are fixed. As soon as it is merged into main, you can merge it in from main.

jduerholt commented 7 months ago

Hi @xxEthene, I let some comments. Sorry for this mess with the order_id. I will investigate the issue from the botorch side further. Just try to change it in the way that I proposed above, and check if it works then.

jduerholt commented 7 months ago

Hi @xxEthene, I created a PR in botorch fixing the issue with the OneHotToNumeric InputTransform. If you are interested, here is the PR: https://github.com/pytorch/botorch/pull/2166

jduerholt commented 7 months ago

Hi @xxEthene, the PR was now merged. Just tell me if you find time to finish this PR, if not I will try to finish it ;)

xxEthene commented 7 months ago

Hi @xxEthene, the PR was now merged. Just tell me if you find time to finish this PR, if not I will try to finish it ;)

Hi @jduerholt, sorry for the delay as my computer is under repairment recently. I will be working on the codes this weekend!

xxEthene commented 7 months ago

Hi @jduerholt, the order_id values are as follows for now:

and I have modified related tests in test_inputs.

jduerholt commented 7 months ago

Hi @jduerholt, the order_id values are as follows for now:

  • ContinuousInput: 1
  • ContinuousDescriptorInput: 2
  • DiscreteInput: 3
  • MolecularInput: 4
  • CategoricalDescriptorInput: 5
  • CategoricalInput: 6

and I have modified related tests in test_inputs.

Hi @xxEthene,

looks good for me, just one thing: you forgot the CategoricalMolecularInput, this one has still order_id 7, it should be between MolecularInput and CategoricalDescriptorInput and carrying the number 5. The following ones should then be 6 and 7.

Can you update this?

Best,

Johannes

xxEthene commented 7 months ago

Hi @jduerholt,

Sorry for this and I have updated the order. I have also formatted the files based on the error messages received. However, the error for the file bofire/strategies/samplers/universal_constraint.py is not detected from my side....I am not very sure about this one.

Best regards, Yuxin

jduerholt commented 7 months ago

However, the error for the file bofire/strategies/samplers/universal_constraint.py is not detected from my side.

Ignore it for now, I think there is some other test regarding the sorting of the features still failing. Can you have a look on this one too? Sorry for iterating this for such a long time!

xxEthene commented 7 months ago

Hi @jduerholt,

The failing tests regarding the sorting of the features are due to the unchanged order_id for outputs. I am so sorry for this. I have updated the order_id for outputs and the order now is:

Hope it can work now!

Best regards, Yuxin

xxEthene commented 7 months ago

Hi @jduerholt, I am so confused by the error messages received from the tests....as everything goes well on my computer and also I did not change anything except the a few lines in universal_constraint.py.

jduerholt commented 7 months ago

It is strange, that it is not occuring locally for you, but the error comes from my side. I overlooked in one of my last PRs something and this seems to be the reason for this behavior. I will put up a PR today and merge it in, then you can merge main again into your PR.

Sorry for this!

jduerholt commented 7 months ago

Should be fixed now, just merge main in. Big sorry for this!

xxEthene commented 7 months ago

Okay, I have done it! :)