numbbo / coco

Numerical Black-Box Optimization Benchmarking Framework
https://numbbo.github.io/coco

"simple"/separable biobj-functions with non-convex dominated hypervolumes? #862

Open Ulfgard opened 8 years ago

Ulfgard commented 8 years ago

Hi,

Maybe I am missing something, but is it possible that there are no simple objective functions, e.g. separable with very simple Pareto fronts in x-space but non-convex dominated hypervolume sets?

An example would be the sphere/sphere pair where f_i(x) = (x-m)^T(x-m) is replaced, for example, by f_i(x) = -exp(-(x-m)^T(x-m)). This would allow investigating the behaviour of algorithms with different selection criteria w.r.t. the hypervolume. It could also become relevant when investigating the behaviour on more "complex" functions.
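A minimal sketch of this effect (Python/NumPy; the optima m1, m2 and the helper is_convex are made up for illustration). For a sphere/sphere pair the Pareto set is the segment between the two optima, and strictly increasing transforms of each objective leave it unchanged, so we can sample it directly and compare the shape of the plain front with the transformed one:

```python
import numpy as np

# Hypothetical optima of the two sphere functions.
m1 = np.array([0.0, 0.0])
m2 = np.array([2.0, 0.0])

def sphere(x, m):
    d = x - m
    return d @ d

# Sample the Pareto set: the segment between the two optima.
t = np.linspace(0.0, 1.0, 201)
X = (1 - t)[:, None] * m1 + t[:, None] * m2

f1 = np.array([sphere(x, m1) for x in X])
f2 = np.array([sphere(x, m2) for x in X])

# Transformed pair f_i(x) = -exp(-sphere_i(x)): same Pareto set,
# but each objective axis is bent by the strictly increasing g(f) = -exp(-f).
g1, g2 = -np.exp(-f1), -np.exp(-f2)

def is_convex(u, v):
    # Crude check on the sampled front: convex iff the slopes of
    # v as a function of u are non-decreasing (u must be increasing).
    slopes = np.diff(v) / np.diff(u)
    return bool(np.all(np.diff(slopes) >= -1e-12))

print("plain front convex:      ", is_convex(f1, f2))  # True
print("transformed front convex:", is_convex(g1, g2))  # False: concave middle part
```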

Ulfgard commented 8 years ago

An example would be NSGA-III, which would behave very differently on the two types of volumes.

nikohansen commented 8 years ago

I believe there are no functions in the bbob testbed which flatten out as ||x|| -> infinity. This is a testbed design decision: there are many, many features a testbed could try to cover, and this one was deemed of lesser relevance.

I can see two reasons why this decision was made.

Ulfgard commented 8 years ago

Hi,

The property I targeted was not "flattening out" but a non-convex hypervolume. You could choose a different transformation, e.g. f_i(x) = log((x-m)^T(x-m) + c) (where c is small), with the same effect.

I disagree regarding the f-transformation: the hypervolume is not invariant to f-transformations of the objectives; thus transforming the target function will lead to an algorithm optimizing the wrong thing.
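A quick numeric illustration of this non-invariance (a minimal Python sketch; the point sets A and B, the reference point, and the log transform are all invented for the example): two approximation sets swap their hypervolume ranking when a strictly increasing transform is applied to the objectives and, consistently, to the reference point.

```python
import math

def hypervolume_2d(points, ref):
    # Dominated hypervolume of 2-D points (minimization) w.r.t. ref,
    # computed by sweeping horizontal slabs downward from ref.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                      # skip dominated points
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

A = [(0.1, 4.0), (4.0, 0.1)]   # two extreme points
B = [(2.0, 2.0)]               # one knee point
ref = (5.0, 5.0)

print(hypervolume_2d(A, ref), hypervolume_2d(B, ref))   # 8.8 < 9.0: B wins

# Apply the strictly increasing transform g = log to everything.
logA = [(math.log(u), math.log(v)) for u, v in A]
logB = [(math.log(u), math.log(v)) for u, v in B]
logref = (math.log(ref[0]), math.log(ref[1]))

print(hypervolume_2d(logA, logref),                     # ~1.70 > ~0.84: A wins
      hypervolume_2d(logB, logref))
```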

I further suspect that this case is not so rare, as it should already be relevant for all non-convex functions, e.g. Rosenbrock.

loshchil commented 8 years ago

the hypervolume is not invariant to f-transformations of the objectives

Then consider a comparison/rank-based hypervolume; this animal lives in section 6.4.3 of my thesis ;-) https://hal.inria.fr/tel-00823882/document
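For readers without the thesis at hand, here is a minimal sketch of one plausible reading of such an indicator (not checked against section 6.4.3; all names are hypothetical): replace each objective value by its rank within the population before computing the hypervolume, which makes the indicator invariant under strictly increasing transforms of each objective.

```python
import numpy as np

def hv2d(points, ref):
    # 2-D hypervolume by slab sweep (same helper as in the sketch above).
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev:
            hv += (ref[0] - f1) * (prev - f2)
            prev = f2
    return hv

def rank_based_hv(F):
    # Replace each objective column by within-population ranks (0 = best);
    # ranks depend only on comparisons, hence the invariance.
    F = np.asarray(F, dtype=float)
    ranks = np.argsort(np.argsort(F, axis=0), axis=0).astype(float)
    ref = (len(F), len(F))   # one beyond the worst possible rank
    return hv2d([tuple(r) for r in ranks], ref)

F = [(0.1, 4.0), (4.0, 0.1), (2.0, 2.0)]
G = [(np.log(u), v**3) for u, v in F]   # strictly increasing in each objective
print(rank_based_hv(F), rank_based_hv(G))  # identical: 6.0 6.0
```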


Ulfgard commented 8 years ago

I do not understand your argument. bbob-biobj measures performance in terms of hypervolume; this is the goal we optimize. Arguing that one could optimize another goal instead is true, but that collides with this specific benchmark. For example, one could argue for just using a set of target points, as NSGA-III does. This will indeed work, but if your target front is strongly concave you will lose large parts of the hypervolume because you do not have enough points there. Similar arguments can be made for ranking functions, simply by considering objectives on vastly different scales.

The point I am making here is that if the goal is to optimize hypervolume, and if the goal is to have a benchmark which tells us something about the capabilities of the algorithms, then we should also have simple example cases of functions which produce a different type of front.
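To make the concave-front point concrete, a small sketch (everything here, the front, the budget k, the nadir reference point, and the greedy selection, is invented for illustration): on a concave front, spreading points evenly, roughly what a reference-direction scheme aims for, spends part of the budget at the extremes, which dominate no volume w.r.t. the nadir, while a greedy hypervolume subset selection of the same size does noticeably better.

```python
import numpy as np

def hv2d(points, ref):
    # 2-D hypervolume by slab sweep (same helper as in the sketches above).
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv, prev = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev:
            hv += (ref[0] - f1) * (prev - f2)
            prev = f2
    return hv

# A concave front: f1^2 + f2^2 = 1 (the DTLZ2 shape), densely sampled.
theta = np.linspace(0.0, np.pi / 2, 721)
front = [(float(np.cos(a)), float(np.sin(a))) for a in theta]

ref, k = (1.0, 1.0), 5   # reference point at the nadir, budget of 5 points

# (a) k points spread evenly along the front (uniform in angle)
even = [front[i] for i in np.linspace(0, len(front) - 1, k).astype(int)]

# (b) k points picked by greedy hypervolume subset selection
chosen = []
for _ in range(k):
    chosen.append(max(front, key=lambda p: hv2d(chosen + [p], ref)))

print("evenly spread:", hv2d(even, ref))    # the two extremes contribute nothing
print("greedy subset:", hv2d(chosen, ref))  # noticeably larger
```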

nikohansen commented 8 years ago

The property I targeted was not "flattening out" but a non-convex hypervolume. You could choose a different transformation, e.g. f_i(x) = log((x-m)^T(x-m) + c) (where c is small), with the same effect.

Right, I would still describe this as "flattening out". If I am not mistaken, the derivative approaches zero in this example as well, so for me it is hard to see how this could not be called "flattening out". Even if the derivative decreased but did not converge to zero, "flatten out" is a wording I might use. It is just a matter of wording; we are talking about the same thing.
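Indeed, the gradient of the suggested log variant vanishes for large ||x|| (a quick check of the claim):

```latex
% For f(x) = \log\bigl((x-m)^\top (x-m) + c\bigr) with c > 0:
\nabla f(x) = \frac{2\,(x-m)}{(x-m)^\top (x-m) + c},
\qquad
\lVert \nabla f(x) \rVert
  = \frac{2\,\lVert x-m \rVert}{\lVert x-m \rVert^{2} + c}
  \;\longrightarrow\; 0
  \quad \text{as } \lVert x \rVert \to \infty .
```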

thus transforming the target function will lead to an algorithm optimizing the wrong thing

The problem, though, is that we do transform the function values when we make the performance assessment. I guess we could now exchange hundreds of messages on how to assess performance on MO problems and what the "right thing" is (as opposed to the "wrong thing"), but I suspect it wouldn't be of much value to have this discussion with me...

I further suspect that this case is not so rare, as it should already be relevant for all non-convex functions, e.g. Rosenbrock.

Sorry, I don't understand this remark: what does "it" refer to, and what does "relevant for the Rosenbrock function" mean?

nikohansen commented 8 years ago

then we should also have simple example cases of functions which produce a different type of front.

If I am not mistaken, the idea behind the bbob-biobj benchmark is to compose functions that we regard as relevant single-objective benchmark functions. However, we do not control the result of the composition w.r.t. the Pareto set or Pareto front; that is, we do not construct particular shapes for them.

I can see the disadvantage of this approach, but I can also see that it might be a feature rather than a bug: not to be led by the shape of the Pareto front, as most MO benchmark designers have been so far, could turn out to be a good thing.

The idea to construct the benchmark this way might be a bad one as well. Up to this point, I am not convinced by the argumentation that it is.

I see two lines of argument that would defeat the idea, namely, that

Of course, there could be others.

loshchil commented 8 years ago

I agree that it is quite likely that the current testbed is not rich enough regarding possible shapes of the Pareto front. Indeed, a reshaping could potentially change the performance of some approaches drastically; we can only guess how, and yes, we would be glad to know. However, I also agree that the current way of simply combining functions has value, as it attempts to address the question "what is the typical distribution of Pareto fronts in relevant real-world problems that I should sample my benchmarks from?" and thus tries to be unbiased at least in this respect. Still, the question is important, and a few simple transformed cases would not harm.

Ulfgard commented 8 years ago

Let's not dwell on this for too long; I think it would be too late to change it anyway.

I think the single-objective testbed is indeed a good one. However, there are differences between SOO and MOO: in SOO we can argue that any algorithm should be invariant against monotonic transformations of the objective function, and thus we do not have to care about those. In MOO this is not the case, because monotonic transformations will change the nature of the hypervolume.
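To spell the difference out (a standard observation, not specific to COCO):

```latex
% Single objective: a strictly increasing g preserves all comparisons,
f(x) < f(y) \;\Longleftrightarrow\; g(f(x)) < g(f(y)),
% so a comparison-based algorithm cannot distinguish f from g \circ f.
% Two objectives: componentwise strictly increasing (g_1, g_2) preserve dominance,
\bigl(f_1(x), f_2(x)\bigr) \prec \bigl(f_1(y), f_2(y)\bigr)
\;\Longleftrightarrow\;
\bigl(g_1(f_1(x)), g_2(f_2(x))\bigr) \prec \bigl(g_1(f_1(y)), g_2(f_2(y))\bigr),
% but they do not preserve the hypervolume ordering of point sets
% (see the numeric example earlier in the thread).
```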

I think it is indeed valid to argue that most unimodal functions we find in practice will be convex; for multimodal functions, however, we will often encounter locally non-convex parts of the hypervolume. Having a plot that shows the performance of algorithms in these cases might be very useful for algorithm design.

nikohansen commented 8 years ago

in SOO we can argue that any algorithm should be invariant against monotonic transformations of the objective function

You cannot be serious. I can totally see the advantage of this invariance, but to say that any algorithm should have this invariance is just not sensible to me, not even close. This is not the reason why the single-objective bbob functions do not flatten out for large absolute values of x.

we will often encounter locally non-convex parts of the hypervolume

I am pretty sure that we encounter many of these in the bbob-biobj benchmark suite. They just might not be

separable with very simple Pareto fronts in x-space