DLR-SC / sqpdfo

Sequential-Quadratic-Programming Derivative-Free Optimization
Other
15 stars 5 forks source link

Improvement idea: Use structure of composite functions #6

Closed freemin7 closed 3 years ago

freemin7 commented 3 years ago

Assuming we have a function f(x) = g(h(x)) we want optimize, and we have white-box access to g(x). Instead of finding a model for f(x), find a model of h(x) (which might be multidimensional) and optimize g(model(x)). To illustrate the benefits let me present a constructed example: f(x) = (sin(x))^2 and we have so far evaluated 0pi, 1pi and 2pi which all resulted in value of 1. The best fit for f(x) is a constant function. However h evaluated at those places yielded 1, -1, 1 so the best fit might be 2x^2 - 1. If we now optimize (2x^2 - 1)^2 we learn that the objective is not only constant but that a plausible approximation could also reach 0 at ±1/(√2) which is not a horrible estimate of pi/2.

While this procedure is roughly n times more expensive where n=dim(h(x)) per iteration, it could be worth it for black boxes with long evaluation times and lead to better approximations in regions where g(x) is not bijective or possibly in the presence of constraints on h(x).

This idea is based on this talk about the same idea for bayesian optimization https://www.youtube.com/watch?v=6Q0HyWeNEoU . An implementation for BO is available here: https://github.com/RaulAstudillo06/BOCF

If not supportable for arbitrary g(x) a talk of people from the DLR might reveal common g(x)'s used there.

Towards the stars, better, Johann-Tobias Schäg

freemin7 commented 3 years ago

A call with the creator of this code made it clear that in it's current use there is no such composite structure to exploit.