numbbo / coco

Numerical Black-Box Optimization Benchmarking Framework
https://numbbo.github.io/coco

independent restarts #395

Closed nikohansen closed 8 years ago

nikohansen commented 8 years ago

Here is my thinking: our code should implement everything that many or most users would have to implement themselves if they wanted to benchmark a solver properly.

Independent restarts are a very simple but comparatively powerful way to explore the capabilities of many solvers for larger budgets, at least in the single-objective case. In particular on multi-modal functions they can be a game changer.

The example experiments should generally implement independent restarts (they are already implemented in the Python example). This would also give (a small) incentive to add smart termination criteria to a benchmarked solver.
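
A minimal sketch of the idea in C, assuming the COCO C API names used in the example experiment; my_optimizer and max_evaluations are placeholders, not the actual example code:

/* keep restarting the solver while budget remains and the final target is not hit */
while (!coco_problem_final_target_hit(problem)
       && coco_problem_get_evaluations(problem) < max_evaluations) {
  /* each iteration is an independent run of the benchmarked solver */
  my_optimizer(problem);  /* hypothetical solver call */
}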

ttusar commented 8 years ago

I agree that we should have some advanced example in addition to the simple one, but let's do this after today's release.

nikohansen commented 8 years ago

I wouldn't make another example for this. IMHO there should be just one, so to speak, the example for a user to start from.

nikohansen commented 8 years ago

I will start with the bbob C example experiment as a test case implementation.

brockho commented 8 years ago

BTW, do we have the functionality available to tell Coco that a restart happened, as was the case with the previous fgeneric? Do we (want to) write data into .rdat files?

ttusar commented 8 years ago

The restarts are currently not supported. The bbob logger outputs only the header information in the .rdat file, while the biobj logger doesn't produce any .rdat files.

nikohansen commented 8 years ago

Records in .rdat files would be nice to have, though not as a priority right now. I guess it would mean having a method like coco_observer_signal_restart(observer), which writes the desired information?
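
If such a method were added, it might be called once per restart in the benchmarking loop, roughly like this (coco_observer_signal_restart is only the method proposed above, not an existing function):

/* hypothetical: tell the observer that an independent restart is about to happen,
   so it can write a record to the .rdat file */
if (coco_problem_get_evaluations(problem) > 0)  /* not before the very first run */
  coco_observer_signal_restart(observer);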

ttusar commented 8 years ago

Sounds reasonable.

brockho commented 8 years ago

The Matlab/Octave exampleexperiment.m now contains restarts by default. It would be nice if somebody (@ttusar?) could check whether this really looks too complicated for the user. I do not think so.

ttusar commented 8 years ago

I can live with that :), but we should add more comments.

We should also unify the example experiments once we are done tweaking them - after all, there should be one best way to do the example experiment and this should be explained in the documentation and implemented in every language. I can do this when we are ready.

@nikohansen should the condition

if (coco_problem_get_evaluations(problem) <= current_evals)

be replaced by

if (coco_problem_get_evaluations(problem) == current_evals)

brockho commented 8 years ago

I have to say I don't understand the current C example experiment anymore. current_evals has not changed since the last algorithm run, while coco_problem_get_evaluations(problem) is the current number of function evaluations used. Hence, we would always expect coco_problem_get_evaluations(problem) >= current_evals, and coco_problem_get_evaluations(problem) == current_evals iff the last run did not do any evaluations. This does not seem right. Or did I miss something?

Why don't the algorithms have the budget as a parameter? Then we could simply start the algorithm with the remaining budget (i.e. BUDGET * coco_problem_get_dimension(problem) - coco_problem_get_evaluations(problem)). Changing this would certainly ease the readability of the code, I guess; see the current proposal in the Matlab exampleexperiment.m. And I doubt that there is any algorithm which does not have a maximum budget as a parameter (whether it is visible to the outside caller is another question).
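
A sketch of what this could look like, assuming a hypothetical solver interface my_optimizer(problem, budget) that respects the budget it is given:

/* pass only the remaining budget to the solver instead of relying on the global BUDGET */
size_t max_evaluations = BUDGET * coco_problem_get_dimension(problem);
size_t remaining = max_evaluations - coco_problem_get_evaluations(problem);
my_optimizer(problem, remaining);  /* hypothetical interface with an explicit budget parameter */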

nikohansen commented 8 years ago

@ttusar, no (it was done deliberately), unless you add an assert for the < part. The question would be: what do you want in this case (which must be considered a bug)? An infinite loop or termination?

ttusar commented 8 years ago

Why?

nikohansen commented 8 years ago

@brockho

nikohansen commented 8 years ago

@ttusar The question would be: what do you want in the < case (which must be considered a bug)? An infinite loop or termination?

brockho commented 8 years ago

I see now the reason for the additional check. It should definitely be commented - otherwise it's more confusing than helping.

Do we also need to add such checks to prevent infinite loops in the other languages? Rephrasing this: How likely is it that an algorithm implementation does not evaluate the objective function when it is called with a budget >1?

nikohansen commented 8 years ago

In the cumulative sense (all users, all algorithms...) rather likely. If this generates an infinite loop, I would consider it to be more than a semi-bug, at least 3/4 ;-)

For example, if we call Python's cma.OOOptimizer twice like

optim = cma.OOOptimizer(...)
optim.optimize(fun, ...)
optim.optimize(fun, ...)

without changing the state of the OOOptimizer instance in between, this is the exact behavior we should see in the second call, as the condition under which the optimizer continues/stops iterating hasn't changed. In the C code, if coco_problem_get_evaluations(problem) exceeds the internal algorithm budget, the algorithm is IMHO supposed not to call the problem anymore.

nikohansen commented 8 years ago

otherwise it's more confusing than helping.

as treating corner cases is most of the time, unfortunately.

ttusar commented 8 years ago

I think the check for the == case is important and should be added to other languages as well.

The < check seems unnecessary (and it really diminishes the understanding of the code).

nikohansen commented 8 years ago

The < check seems unnecessary

How would that argument not apply to an assert? The question remains: would you rather have an infinite loop in that case?

ttusar commented 8 years ago

I hate that we have to sacrifice readability for security...

nikohansen commented 8 years ago

Now, even if I add the line

assert(coco_problem_get_evaluations(problem) >= done_evals)

I can't bring myself to change

if (coco_problem_get_evaluations(problem) <= done_evals)

to

if (coco_problem_get_evaluations(problem) == done_evals)

Why? Because AFAIK asserts can be compiled out (e.g. in optimized builds with NDEBUG defined), so they cannot serve as the only safeguard.

Here is why: the problem pointer is passed to a function which could do anything with it. The optimizer can just change the number of evaluations the problem has. It just can. Whether or not it is supposed to, or whether or not it makes sense or is meaningful, it just can do it.

So I guess then

if (coco_problem_get_evaluations(problem) < done_evals)
   coco_error("something weird happened here");

would be the right thing to do in this case?
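
Inside the restart loop, the combined checks could then look roughly like this sketch (done_evals is recorded before the solver call; my_optimizer and remaining_budget are placeholders, not the actual example code):

size_t done_evals = coco_problem_get_evaluations(problem);
my_optimizer(problem, remaining_budget);  /* hypothetical solver call */
if (coco_problem_get_evaluations(problem) < done_evals)
  coco_error("example experiment: the number of evaluations decreased");
else if (coco_problem_get_evaluations(problem) == done_evals)
  break;  /* the run performed no evaluations: stop restarting instead of looping forever */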

ttusar commented 8 years ago

We are just converging back to your first proposition (aka. the current code), aren't we?

brockho commented 8 years ago

I don't know: do we really need to decrease the readability of the code for the case that a user messes up by changing the number of function evaluations made within the problem? I would rather call it "wrong usage on purpose" than a bug.

nikohansen commented 8 years ago

Re readability: if I see ==, then to me it means the check does not apply in the < case. That is, I wonder why the loop does and should continue in that case (which is what I implicitly read).

I guess readability is directly related to how we write such code, and we seem to write code differently in this scenario where some state is never supposed to happen: I try to cover this state if possible and you try not to cover it if possible. It is an interesting question, then, which style should be considered preferable and why.

Re wrong usage: the question remains, should wrong usage be silently accepted, lead to an infinite loop, or stop the program ;-)

I agree that corner case code cluttering is annoying.

nikohansen commented 8 years ago

For example, if I do i += 1 somewhere, I always check for i < max_i or (its negation) i >= max_i. What I never do: check for i == max_i, even though I know i > max_i cannot happen because I add only 1 at a time. One reason is that == breaks more easily under later modifications of the code.

nikohansen commented 8 years ago

How likely is it that an algorithm implementation does not evaluate the objective function when it is called with a budget >1?

Assuming that the budget must be the only reason why an algorithm terminates is very optimistic. For example, a gradient-based algorithm might realize that it cannot do anything within a prescribed budget smaller than the dimension, and hence do nothing (and possibly issue a warning). That seems to be a pretty reasonable, if not the only reasonable, behavior.

Also: the remaining budget is not necessarily greater than 1.

nikohansen commented 8 years ago

Do we also need to add such checks to prevent infinite loops in the other languages?

I fear the answer is yes, if generic restarts already have been implemented. I will address this problem in the Python example.

brockho commented 8 years ago

I agree with you that the potential problem with "classical" algorithms might actually occur and should be prevented. I checked the old code and "voilà", it solves the problem of infinite loops elegantly by introducing a maximal number of restarts:

for restarts in xrange(maxrestarts + 1):
    if restarts > 0:
        f.restart('independent restart')  # additional info
    run_optimizer(f.evalfun, dim,  eval(maxfunevals) - f.evaluations,
                   f.ftarget)
    if (f.fbest < f.ftarget
                   or f.evaluations + eval(minfunevals) > eval(maxfunevals)):
        break

I have not been there in 2008/2009, but probably there was already a similar discussion with exactly this solution. Shall we go this way in all languages?

nikohansen commented 8 years ago

I don't think introducing maxrestarts is a preferable solution to avoid an infinite loop. On the other hand, I see some use in setting the number of restarts to zero or one, or any other number, and in the possibility to modulate the overall runtime like this without limiting the overall budget.

Given that maxrestarts can be pretty much any number (in Python there is no upper bound for finite integer values), this isn't a watertight solution to prevent an unintentional quasi-infinite loop.

ttusar commented 8 years ago

Do we now have an "ideal" example experiment all others should emulate? This issue should be resolved as soon as possible...

nikohansen commented 8 years ago

I added a tentative example in C which implements independent restarts. However, the solver call is still commented out, because the interface needs to change: the solver needs to have a budget parameter; it cannot use the global BUDGET anymore (this was a rather crude hack anyway). Changing the interface will however break the other example(s). So we should first come to a decision on how exactly the interface should look, and where we expect the user to plug in the optimizer / which code we expect the user to copy-paste and/or change.

ttusar commented 8 years ago

I was about to fix the BUDGET, but I wanted to first check if there is anything else to do (so that I could do everything at the same time).

So, should I take example_bbob as the template for the others (I meant to update the C and Java example experiments)? Should we have two examples (one for each suite) or have a single one where the name of the suite/observer can be easily changed?

ttusar commented 8 years ago

Sorry, I just realized that I somehow skipped the second part of your comment... Yes, we should decide soon.

nikohansen commented 8 years ago

I probably edited my comment while you were answering :)

I tend to be in favor of having only one example experiment and changing the parameters. But I always find the outcome hard to predict, so writing the code and comments (what the user is expected to do) and deciding by reviewing the result is what I tend to do.

I believe we should allow for a rather flexible interface to the desired solver. One way to do so would be to write a generic wrapper (called coco_optimizer in Python), where the user plugs in the solver. This wrapper would be called in the example experiment loop. The wrapper code might be the code we suggest the user copy-paste (instead of the code that loops over all problems).
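
A rough sketch of what such a wrapper could look like in C (coco_optimize and my_optimizer are placeholder names, not existing code):

/* generic wrapper: the single place where the user plugs in the solver to be benchmarked */
static void coco_optimize(coco_problem_t *problem, const size_t max_evaluations) {
  /* replace this call with the benchmarked solver; it is expected to respect max_evaluations */
  my_optimizer(problem, max_evaluations);
}

The example experiment loop would then only call coco_optimize(problem, remaining_budget) for each problem returned by the suite, and the looping code itself would not need to be touched by the user.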

Still, the choice of the testbed is another place where the user potentially wants to make changes.

nikohansen commented 8 years ago

Another subtlety: independent restarts only make sense if the algorithm is randomized and doesn't use the same random seed. One way to overcome this is to randomize the initial solution (if the algorithm accepts a single initial solution as input). One could also randomize the "region of interest". All this depends, however, on the specific interface the benchmarked optimizer provides to the outside world.
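
A sketch of how the initial solution could be randomized within the region of interest before each restart, assuming the solver accepts an initial point and assuming a COCO random generator created with coco_random_new (my_optimizer, random_generator and remaining_budget are placeholders for the user's code):

/* draw a random initial solution uniformly inside the region of interest for each restart */
size_t i, dimension = coco_problem_get_dimension(problem);
const double *lower = coco_problem_get_smallest_values_of_interest(problem);
const double *upper = coco_problem_get_largest_values_of_interest(problem);
double *x0 = coco_allocate_vector(dimension);
for (i = 0; i < dimension; i++)
  x0[i] = lower[i] + coco_random_uniform(random_generator) * (upper[i] - lower[i]);
my_optimizer(problem, x0, remaining_budget);  /* hypothetical solver interface taking an initial point */
coco_free_memory(x0);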

ttusar commented 8 years ago

The development branch now contains the rewritten example experiment in C. Take a look and let me know if anything should be changed. I don't like how big it became (especially compared to the simple one we started from), but hopefully the comments make it understandable.

brockho commented 8 years ago

Although I agree with @ttusar that the code is quite long, I find it more readable than the old C example experiment (before the entire rewrite of the platform).

brockho commented 8 years ago

The Matlab/Octave exampleexperiment.m as well as the SMS-EMOA example now allow for restarts (following more or less the implementation of the C experiment).

nikohansen commented 8 years ago

Looks nice and clean to me. One minor comment: I would write problem in capitals, to emphasize that it is a file-global variable. One less minor comment: setting the number of independent restarts to zero is in effect quite similar to not having implemented restarts at all: it will mean that more than 80% of the participants will not use them.

ttusar commented 8 years ago

Ok. What should be the default number of independent restarts?

nikohansen commented 8 years ago

IMHO by default restarts should be applied until the budget is exhausted, or the final target is reached. I chose 1e9 in Python, but I guess any large number would fit the bill.

brockho commented 8 years ago

Although there are slight variations among the four supported languages, all of them now implement restarts (with the number of restarts set to at least 10,000). Shall we simply close the issue and move forward to the next larger release?