pytorch / botorch

Bayesian optimization in PyTorch
https://botorch.org/
MIT License

Is the variable name “max_ref_point” appropriate? #2585

Closed: searchlie closed this issue 1 month ago

searchlie commented 1 month ago

Hi. I was reading the botorch code for multi-objective optimization and got confused by the function infer_reference_point. It takes a variable called "max_ref_point", which seems to be treated the same way as the so-called nadir point, that is, the point where all the objective functions are at their worst. If my understanding is correct, since botorch maximizes rather than minimizes, shouldn't it be called "min_ref_point" rather than "max_ref_point"? I apologize if I'm wrong. I would be grateful if anyone with more knowledge could give me their opinion.

saitcakmak commented 1 month ago

Hi @searchlie. Thanks for flagging this. You're right that max_ref_point provides a lower bound on the inferred reference point and that it is similar to the nadir point in this case. Maybe @sdaulton can provide context on why the name max was used here.

sdaulton commented 1 month ago

The argument is called max_ref_point because it is actually an upper bound on the inferred reference point. That is, the inferred reference point should be no larger than max_ref_point (given that this is a maximization problem). The motivation for this (which needs more consideration) is that there may be cases where we want to force exploration in areas of the objective space that would not be captured if we used, e.g., the nadir as the reference point. For example, suppose I want to explore all trade-offs where objectives y1 and y2 are greater than 0, but my observations all have y1 and y2 > 3. Then the nadir would be > 3 with respect to both objectives, so I would never explore the regions of objective space between 0 and 3. max_ref_point lets one say, for instance, that the reference point should be no greater than 0 with respect to each objective.
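
To make the example above concrete, here is a minimal sketch in plain Python. It is not the actual implementation of infer_reference_point, which applies additional heuristics; the helper names nadir_point and cap_reference_point and the observation values are hypothetical and chosen only for illustration. It assumes maximization, so the nadir is the componentwise minimum of the observations.

```python
def nadir_point(observations):
    """Componentwise worst value; under maximization this is the minimum."""
    return [min(column) for column in zip(*observations)]


def cap_reference_point(ref_point, max_ref_point):
    """Clamp each coordinate so it is no larger than the given upper bound."""
    return [min(r, m) for r, m in zip(ref_point, max_ref_point)]


# Hypothetical observations: every point has y1 > 3 and y2 > 3.
observations = [(3.5, 6.0), (4.0, 5.0), (6.0, 3.2)]

nadir = nadir_point(observations)
capped = cap_reference_point(nadir, max_ref_point=[0.0, 0.0])

print(nadir)   # [3.5, 3.2]
print(capped)  # [0.0, 0.0]
```

With the nadir alone the reference point sits above 3 in both objectives, whereas the upper bound pulls it down to 0, so the region between 0 and 3 stays in scope.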

searchlie commented 1 month ago

Thank you for your quick response and easy-to-understand explanation. So, if we want to search for a Pareto front that includes values worse than the observed worst value for at least one objective function, we should not use the worst point (nadir point) as the reference point, right? I understand that max_ref_point is a parameter that is set so that the reference point does not become too large in such cases.
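
A short sketch of why this matters, again in plain Python with hypothetical names and values: under maximization, a point only contributes hypervolume if it is strictly better than the reference point in every objective. A trade-off point that is worse than the observed worst in one objective is therefore invisible when the nadir is the reference point, but counts once the reference point is lowered.

```python
def exceeds_reference(point, ref_point):
    """True if the point is strictly better than ref_point in every objective."""
    return all(p > r for p, r in zip(point, ref_point))


# Hypothetical candidate: worse than the observed worst in y1, very good in y2.
candidate = (1.0, 8.0)

nadir_ref = [3.5, 3.2]   # assumed nadir of earlier observations
capped_ref = [0.0, 0.0]  # reference point bounded above by 0

counts_with_nadir = exceeds_reference(candidate, nadir_ref)
counts_with_cap = exceeds_reference(candidate, capped_ref)

print(counts_with_nadir)  # False
print(counts_with_cap)    # True
```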