Error in TPA for VkD - Githubissues

kostasvar commented 6 years ago

An error coming from TPA appears when I use VkD-CMA for certain functions of the coco bbob-largescale suite. In particular, the error comes from the lines 464 and 469 here - the two symmetric points around the mean along the mean shift are not identified at the set of sampled solutions. I think it's a numerical issue because of the fact that the Mahalanobis norm obtains large values of order 10^13 and the norm of dy becomes very small - see lines 410 and 411 here.

An example for the Sharp Ridge function in dimension 20:

import cma
import numpy as np
import cocoex
from cma import restricted_gaussian_sampler as rgs
N = 20                                                    
k = int(np.log(N))
bbob = cocoex.Suite('bbob-largescale', '', '')
f = bbob.get_problem_by_function_dimension_instance(13, N, 1)  
x0 = 10*(np.random.rand(N)-0.5)
sigma0 = 0.02*10
f._best_parameter('print')
x_opt = np.loadtxt('._bbob_problem_best_parameter.txt')
f_0 = lambda x: f(x_opt + x)
cma.fmin(f_0, x0, sigma0, options = rgs.GaussVkDSampler.extend_cma_options({
    'CMA_sampler_options' : {'kadapt' : True, 'k_init' : k}}))
cma.plot()

VkD - CMA :

sharpridgevkd

CMA:

sharpridgecma

Except the Sharp Ridge function, it appears in other cases too, for example: in dimension 20 for the Weierstrass, Schaffer and Katsuuras functions (and for more functions in larger dimensions).

nikohansen commented 6 years ago

It would be quite useful if we could also see the console output, that is, see what the error actually is and at which iteration/evaluation it starts to occur. (Unrelated hint: use the python directive to format Python code, I did the edit above. Another hint: you can link to a specific line in Github, just click on the line number and you have the link in the browser address bar).

kostasvar commented 6 years ago

It doesn't give any additional information. For the previous run:

(6_w,12)-CMA-ES (mu_w=3.7,w_1=40%) in dimension 20 (seed=819516, Wed Apr 25 15:36:01 2018)
Iterat #Fevals   function value  axis ratio  sigma  min&max std  t[m:s]
    1     12 2.679158387177535e+03 1.0e+00 2.00e-01  2e-01  2e-01 0:00.0
    2     24 2.598995544367311e+03 1.0e+00 2.00e-01  2e-01  2e-01 0:00.0
    3     36 2.495400904925493e+03 1.0e+00 2.00e-01  2e-01  3e-01 0:00.0
  100   1200 1.741548724021021e+02 1.0e+00 2.00e-01  5e-02  1e-01 0:00.2
  200   2400 6.872605627093556e+01 1.0e+00 2.00e-01  6e-04  2e-03 0:00.5
  300   3600 6.734457470030489e+01 1.0e+00 2.00e-01  4e-06  9e-06 0:00.7
  400   4800 6.733614466188762e+01 1.0e+00 2.00e-01  6e-08  2e-07 0:01.0
  500   6000 6.733599695540329e+01 1.0e+00 2.00e-01  1e-09  4e-09 0:01.2
error
error
.
.
.
error
 3591  43092 6.733599307975393e+01 1.0e+00 2.00e-01  6e-12  1e-11 0:11.1
termination on tolx=1e-11 (Wed Apr 25 15:36:17 2018)
final/bestever f-value = 6.733599e+01 6.733599e+01
incumbent solution: [-0.65353758  0.82876551 -1.15283544  0.72297646  0.20924901  0.45386475
  1.34352137  0.15001077 ...]
std deviations: [9.32203819e-12 8.46528304e-12 8.57725603e-12 8.69591441e-12
 6.87024787e-12 8.59784986e-12 8.60702769e-12 6.39391389e-12 ...]
1797 lines written in outcmaesdownaxlen.dat
1797 lines written in outcmaesdownfit.dat
1797 lines written in outcmaesdownstddev.dat
1797 lines written in outcmaesdownxmean.dat
1797 lines written in outcmaesdownxrecentbest.dat
Out[7]: <cma.evolution_strategy.CMADataLogger at 0x112cd98d0>

In rare cases it crashes though, and I am trying to reproduce this now.

nikohansen commented 6 years ago

OK, the printed output does not correspond to the plotted output. I assume that the error messages start at the very same time where the sigma graph has the kink (that's what I get anyways when I run the above code).

AFAICS, for the time being, we should stop the algorithm when this error occurs.

~The simplest solution I can come up with right now: the code raises~

class NumericalPrecisionError(ArithmeticError):
    """lack of numerical precision lead to unreliable results"""

~which we can catch in the experiment. At some point it could also be catched in the caller which adds a stop condition 'numericalprecision' in this case.~

@youheiakimoto what do you think?

EDIT: the reason why the error occurs is in fact not necessarily numerical precision in general. It is just that something close hasn't been in the list, which could also mean that the list is not in a good state. Hence raise RuntimeError("no mirrored vectors found for TPA") might be the better option.

kostasvar commented 6 years ago

You're right, I edited the previous posts with the same run, the error starts at the kink.

nikohansen commented 6 years ago

I'll implement raising a RuntimeError. Hence the call of fmin should go into a try block to catch the error.

nikohansen commented 6 years ago

Alternatively we could issue a warning like

https://github.com/CMA-ES/pycma/blob/25b8957f071c9eafa4474ca10687e9a92302a943/cma/restricted_gaussian_sampler.py#L466-L467

which can be caught like

try:
    warnings.filterwarnings('error', 'no .* mirrored vector found for TPA', RuntimeWarning)
    cma.fmin(...)
except RuntimeWarning:
    print("next restart")

EDIT: it is probably still better to have a

class MirroredVectorNotFound(RuntimeWarning):
    """a mirrored vector as needed in Two-Point Adaptation TPA was not found"""

which can be filtered and caught in the above except clause specifically.

youheiakimoto commented 6 years ago

The current implementation of TPA is not optimal at all. It would be nice if we have an interface to get the indices of symmetric samples. Then, the problem is essentially solved. Fo the time being, a better implementation might be to get the indices of the solutions having the maximum and minimum inner product to the mean shift vector as those of positive and negative symmetric points.

CMA-ES / pycma

Error in TPA for VkD #66