davisking / dlib

A toolkit for making real world machine learning and data analysis applications in C++
http://dlib.net
Boost Software License 1.0
13.53k stars 3.37k forks source link

find_max_global boundary definition and auto-log-scale definition cause issues #2585

Closed shusheer closed 2 years ago

shusheer commented 2 years ago

Expected Behavior

Optimise an "easy" function with a maximum at say x=5.0, but which must have non-negative parameters. Upper bound is something not to far off, say 6.5. Should find the local maximum easily.

Current Behavior

Doesn't find the maximum of this toy problem - presumably because to avoid 0.0 as a lower bound, putting e.g. 0.001 means that the exploration of x is performed on a log-scale, and it spends all its time exploring in the region <1.0

Steps to Reproduce

See below code of a toy problem, where 0.0 would provide a divide-by-zero issue so should be avoided. Change LOW_BOUND to 0.0 for correct behaviour, with a risk of and illegal input, or set LOW_BOUND to 0.001 to demonstrate the incorrect (or, at least, unexpectedly poor) behaviour.

The toy problem is 1D and has a parameter that must be positive, where we think the optimum probably lies in the range 0...6.5. The toy function does indeed have a relatively wide Gaussian peak at x=5.0, with a couple of different slopes and some asymptotic behaviour near 0. The peak at x=5.0 is relatively wide, with the entire region from 4.6 - 5.4 being more optimal than anywhere else on the function, so >10% of x-values within the 0...6.5 bounds are in the correct area. This means a truly random search should end up in the correct region after ~10 random samples. So this shouldn't be a "hard" function to optimise if the optimisation is performed in a linear space. However, in a log space, the region 4.6-5.4 out of the range 0.001...6.5 represents only ~1.8% of the total range of the variable, so this explains why the problem goes from trivially easy to unreasonably hard as a result of the auto-log-exploration logic. Obviously the problem only becomes more difficult if the lower bound is set even closer to 0.0.

Suggested Simple Solution

The user doesn't currently have a way to disable log exploration, and adding this would be an API change. How about making auto-log-exploration alternate between linear and log exploration, as you don't know that the user wants to do this? So every second random exploration sample would be taken on a log scale for those variables that match the criteria.

Suggested Improved Solution

Whilst we're at it, why not allow auto-log-exploration of negative bounds as well? If the bounds do not cross zero, then we can log-explore e.g. the range -0.01 to -1000 just as reasonably as we do for +0.01 to +1000 for example. And once we recognise that, we can also recognise that auto-log-exploration is just a special case of increasing search density around a specific region (0.0). So why not provide a mechanism allowing the user to provide a "scale conversion" function that takes a list of floats in the closed interval 0..1 for each dimension, and outputs the resulting values to use - the user can then use this function to increase search density around some points (e.g. log, or quadratic, or whatever) and also avoid returning special-cases like 0.0 if they need to - this can work for floats or ints, it's up to the user to return allowed values. The user also knows that you will treat the 0..1 space as a linear search space, so they can transform this as required. The user is also entirely responsible for converting the closed interval of 0..1 to whatever bounds they want (closed or open). The quadratic local max improvement only operates on non-int variables, so in this case you simply keep those elements of the provided input list constant when calling the "scale conversion" function on the non-int variables, and the "scale conversion" must provide the same (float(int)) output given the same (float) input for any given dimension.

Of course the user could write a wrapper for the function being optimised that does this "scale conversion" - but to make life a lot easier for normal users, you could simply provide a set of such functions, one of which performs the current auto-log behaviour, one of which always performs linear scale+offset to bounds, another that could take an additional argument of a list of illegal values for each variable to avoid, and so on. A handful of such functions would cover the vast majority of use-cases I suspect.

LOW_BOUND = 0.001

import numpy as np import dlib

def testfunc(x): res = -0.05-0.03x-(0.8-min(x,0.8))0.03-0.00025/x+0.2np.exp(-0.5(((x-5.)/0.25)**2))

print(f"x={x},y={res}")

return res

x = dlib.find_max_global(testfunc, [LOW_BOUND], # Lower bound constraint to avoid 0.0 [6.5], # Upper bound constraint 60) # The number of times find_mx_global() will call testfunc()

print(f'\nFound optimal x:{x[0]} y={x[1]} {"which is NOT actually optimal" if x[1]<0 else "which is correct"}') print(f'dlib version {dlib.version}')

davisking commented 2 years ago

Right. But you can use the global_function_search API and do all this stuff already. The find_max_global() function is just a simple convenience wrapper around global_function_search for the most common use-case.

dlib-issue-bot commented 2 years ago

Warning: this issue has been inactive for 35 days and will be automatically closed on 2022-07-14 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 2 years ago

Warning: this issue has been inactive for 43 days and will be automatically closed on 2022-07-14 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

dlib-issue-bot commented 2 years ago

Notice: this issue has been closed because it has been inactive for 45 days. You may reopen this issue if it has been closed in error.