pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License

[Feature Request] Acquisition Functions for Top-K Estimation #1733

Open Ryan-Rhys opened 1 year ago

Ryan-Rhys commented 1 year ago

🚀 Feature Request

Acquisition function implementation for the task of top-k estimation with diversity as per the following paper:

https://arxiv.org/pdf/2210.01383.pdf

Motivation

Top-k estimation (estimating a set of k optimal designs under a penalty that encourages diversity) is an important task in applications such as virtual screening for drug discovery and materials design.

Additional context

@sangttruong is prepared to open a PR for his implementation.

eytan commented 1 year ago

Awesome! Thank you so much for your contributions in advance, Sang!

Ryan, FYI https://arxiv.org/pdf/2210.01383.pdf is probably a more canonical reference since that is the published NeurIPS 2022 version.

Since many acquisition functions can be built off of HES/EHIG, if it's not too hard, I wonder if it's worth implementing something other than top-k at the same time, just to set up a good separation of concerns and class structure. For example, a generic EHIG class that is an abstract base class with a (differentiable) loss function method (of which top-k would be a subclass), or some kind of wrapper that returns a new AF with a given loss function (IIRC we do something similar to this with probabilistic reparameterization).

To test the separation of concerns, it could be helpful to try implementing another simple EHIG AF, such as k-guesses from that paper.
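
A minimal sketch of the class structure suggested above, written in dependency-free Python rather than against BoTorch's actual API. All names here (`HEntropyLoss`, `TopKDiversityLoss`, `h_entropy`, `diversity_weight`) are hypothetical, the top-k loss is an assumed form (negative summed value minus a weighted pairwise-distance bonus) and not necessarily the paper's exact definition, and the action set is a small finite list instead of something optimized by gradients. A real implementation would operate on tensors so that the loss stays differentiable.

```python
from abc import ABC, abstractmethod
from itertools import combinations


class HEntropyLoss(ABC):
    """Hypothetical base class: subclasses only define the loss l(f, a)."""

    @abstractmethod
    def loss(self, f_values, action):
        """Loss of taking `action` if the true function values are `f_values`."""

    def h_entropy(self, f_samples, actions):
        """Monte Carlo Bayes risk: min over actions of the mean loss over samples."""
        return min(
            sum(self.loss(f, a) for f in f_samples) / len(f_samples)
            for a in actions
        )


class TopKDiversityLoss(HEntropyLoss):
    """Assumed top-k loss: reward high values and pairwise spread of the k designs."""

    def __init__(self, designs, diversity_weight=1.0):
        self.designs = designs                    # 1-D design locations
        self.diversity_weight = diversity_weight  # strength of the diversity bonus

    def loss(self, f_values, action):
        # `action` is a tuple of k indices into `designs`.
        value = sum(f_values[i] for i in action)
        spread = sum(
            abs(self.designs[i] - self.designs[j])
            for i, j in combinations(action, 2)
        )
        return -(value + self.diversity_weight * spread)


# Toy usage: three candidate designs, two posterior samples, k = 2.
designs = [0.0, 0.1, 1.0]
f_samples = [[1.0, 0.9, 0.5], [0.8, 1.0, 0.6]]
actions = list(combinations(range(len(designs)), 2))
risk = TopKDiversityLoss(designs, diversity_weight=1.0).h_entropy(f_samples, actions)
```

Under this layout, another EHIG-style acquisition function would only need a new `loss` method, which is the separation of concerns being proposed. The expected reduction in `h_entropy` after a fantasized observation would be computed once, in shared code.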

Balandat commented 1 year ago

This would be great to have. More than happy to provide guidance on the implementation.

As Eytan said, a generic setup for this would be nice. Maybe there are also opportunities to share some of the generic components with the entropy-based acquisition functions that were added recently.

sangttruong commented 1 year ago

Thank you for your comment. I agree it would be good to have an EHIG abstract base class with a differentiable loss function. I will also write up a few test AFs from the paper, such as k-guess and top-k, to test out the separation of concerns and class structure.

eytan commented 1 year ago

Hi @sangttruong, just checking in on this. It would be really exciting to get this framework and AF into BoTorch. Let us know if there is any way we can help here, and we are happy to review intermediate diffs to provide feedback.

sangttruong commented 1 year ago

Dear @eytan, I apologize for the delayed response -- it has been quite hectic starting the quarter here at Stanford. The good news is that I have the code almost ready with various loss functions, and I am trying to complete a few details before submitting a pull request. I also have a corresponding tutorial notebook that serves as a sanity check. I am eager to receive your feedback.

eytan commented 1 year ago

No worries! Looking forward to it, thanks again!