CMA-ES / libcmaes

libcmaes is a multithreaded C++11 library with Python bindings for high performance blackbox stochastic optimization using the CMA-ES algorithm for Covariance Matrix Adaptation Evolution Strategy
Other
321 stars 78 forks source link

Custom stopping criteria #133

Closed fergu closed 7 years ago

fergu commented 9 years ago

I am attempting to optimize a function that uses a 4th order polynomial whose coefficients are to be determined by the optimizer.

I am attempting show that multiple runs of the optimizer converge to the same answer of both the function and its dependent parameters, and to what precision we can say we have that answer. The polynomial formed by the coefficients is relatively similar every time, even though the coefficients themselves tend to vary run-to-run. After some consideration, I believe that this is because I am using the polynomial - not the coefficients - as inputs to the function whose output I use, and of course there are many ways to achieve the same polynomial, so having different coefficients makes sense.

My question is that the values of sigma (which I'm assuming is the standard deviation of each value) are still large. I feel that if I decrease the tolerance for sigma that I can require that the parameters be more specifically determined before deciding that the optimization is converged. To this end, I think that overriding the stop criteria is most appropriate.

What is the best way to go about this? I have spent some time reading through the code, but it is hard to tell how far down I need to go to override the stop behavior. Should I completely subclass cmastopcriteria and then pass it to ESOStrategy (And whatever other changes that requires), or is there a more simple way to override this behavior? What sort of functions are expected to be present in cmastopcriteria?

beniz commented 9 years ago

My question is that the values of sigma (which I'm assuming is the standard deviation of each value) are still large.

sigma is the step-size, if you want to see the standard deviaton of each of the final parameter values, use CMASolutions::errors(). Be sure to get the last version of the of the lib as there has been a recent issue with this function.

I don't believe there's a lot of common issues that require overriding the default stopping criteria. Let us know if the per-parameter errors above are not satisfactory.

fergu commented 9 years ago

Oh, thank you! Turns out that I was misunderstanding what each output meant.

I am in the process of working on a custom plotting function to output this information as it would be useful to have at every step. Turns out that it's a bit harder when using a bounding box for the parameters instead of unbounded (as in the examples). I have a bit to learn about template programming, it seems.

Edit: I should say that I can get the basic example with no bound strategy working just fine, and I can get the program to compile when using a bounding strategy, but it crashes when running

(Parenthetical comments are mine, made after the fact)

Got 0x7fff12ad4100 (pfunc)
Plot 0x65bb00 (plotf)
(call to cmaes() here)
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

This is with the plotting function declared as:

PlotFunc<CMAParameters<GenoPheno<pwqBoundStrategy>>,CMASolutions> plotf = [](const CMAParameters<GenoPheno<pwqBoundStrategy>> &cmaparams, const CMASolutions &cmasols, std::ofstream &fplotstream) 

and the optimizer is set up as

GenoPheno<pwqBoundStrategy> gp(lbounds,ubounds,dim); // genotype / phenotype transform associated to bounds.
CMAParameters<GenoPheno<pwqBoundStrategy>> cmaparams(dim,&x0.front(),sigma,-1,0,gp); // -1 for automatically decided lambda, 0 is for random seeding of the internal generator.

// Set up parameters...

ProgressFunc<CMAParameters<GenoPheno<pwqBoundStrategy>>,CMASolutions> pfunc = CMAStrategy<CovarianceUpdate,GenoPheno<pwqBoundStrategy>>::_defaultPFunc;

CMASolutions cmasols = cmaes<>(ff,cmaparams,pfunc,nullptr,cmasols,plotf);

Edit 2: Calling cmaes() as

CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(ff,cmaparams,pfunc,nullptr,cmasols,plotf);

results in the same crash.

I'm guessing I'm doing something stupid and hopefully I'll catch it tomorrow with fresh eyes, but is there any chance that this is a bug?

beniz commented 9 years ago

If I understand correctly, you are willing to override the default plotting function (i.e. the function that write data to file) while using the geno/pheno transform for bounded parameters. Below is a full piece of code that I've tested:

#include "cmaes.h"
#include <iostream>

using namespace libcmaes;

FitFunc fsphere = [](const double *x, const int N)
{
  double val = 0.0;
  for (int i=0;i<N;i++)
    val += x[i]*x[i];
  return val;
};

PlotFunc<CMAParameters<GenoPheno<pwqBoundStrategy>>,CMASolutions> plotf = [](const CMAParameters<GenoPheno<pwqBoundStrategy>> &cmaparams, const CMASolutions &cmasols, std::ofstream &fplotstream)
{
  fplotstream << "kappa=" << cmasols.max_eigenv() / cmasols.min_eigenv() << std::endl; // storing covariance matrix condition number to file.                                                                                                                                                                                  
  return 0;
};

int main(int argc, char *argv[])
{
  int dim = 10; // problem dimensions.                                                                                                                                                                                                                                                                                         
  double sigma = 0.1;
  double lbounds[dim],ubounds[dim]; // arrays for lower and upper parameter bounds, respectively                                                                                                                                                                                                                               
  for (int i=0;i<dim;i++)
    {
      lbounds[i] = -2.0;
      ubounds[i] = 2.0;
    }
  std::vector<double> x0(dim,1.0); // beware that x0 is within bounds.                                                                                                                                                                                                                                                         
  GenoPheno<pwqBoundStrategy> gp(lbounds,ubounds,dim); // genotype / phenotype transform associated to bounds.                                                                                                                                                                                                                 
  CMAParameters<GenoPheno<pwqBoundStrategy>> cmaparams(x0,sigma,-1,0,gp); // -1 for automatically decided lambda, 0 is for random seeding of the internal generator.                                                                                                                                                           
  cmaparams.set_algo(aCMAES);
  cmaparams.set_fplot("pffunc.dat");
  CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(fsphere,cmaparams,CMAStrategy<CovarianceUpdate,GenoPheno<pwqBoundStrategy>>::_defaultPFunc,
                                                            nullptr,CMASolutions(),plotf);
  std::cout << "best solution: " << cmasols << std::endl;
  std::cout << "optimization took " << cmasols.elapsed_time() / 1000.0 << " seconds\n";
  return cmasols.run_status();
}

After running it, the pffunc.dat file should contain something similar to:

kappa=-nan
kappa=1
kappa=1.29860639218778
kappa=1.37157837144256
kappa=1.45304323889142
kappa=1.79211514552514
kappa=1.95989495553559
kappa=2.11347515355253

Let me know if this is not helping out with what you are trying to achieve.

fergu commented 9 years ago

That did it! It turns out that the difference was in the call to cmaes(). I had

    CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(ff,cmaparams,pfunc,nullptr,cmasols,plotf);

Which was based off of an example you gave (Which worked) for custom plotting without this geno/pheno type. You changed it so that cmasols became:

    CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(ff,cmaparams,pfunc,nullptr,CMASolutions(),plotf);

So something must have been getting changed that I (or the library) wasn't expecting in the first form by just using cmasols, while the second (using CMASolutions()) addressed that issue.

Thank you!

beniz commented 9 years ago

good, this is a bug in the example, I've noticed the oddity this morning. I had never noticed it since it seems my compiler lets it pass and that it was running fine on my machines. A fix will clear this issue soon.

On March 24, 2015 4:19:20 PM GMT+01:00, kjfergu notifications@github.com wrote:

That did it! It turns out that the difference was in the call to cmaes(). I had

CMASolutions cmasols = cmaes<GenoPheno>(ff,cmaparams,pfunc,nullptr,cmasols,plotf);

Which was based off of an example you gave (Which worked) for custom plotting without this geno/pheno type. You changed it to

CMASolutions cmasols = cmaes<GenoPheno>(ff,cmaparams,pfunc,nullptr,CMASolutions(),plotf);

So something must have been getting changed that I (or the library) wasn't expecting in the first form, while the second addressed that issue.

Thank you!


Reply to this email directly or view it on GitHub: https://github.com/beniz/libcmaes/issues/133#issuecomment-85548666

Sent from my Android device with K-9 Mail. Please excuse my brevity.

nikohansen commented 9 years ago

I have a two questions, aside from the implementation:

What is (are) the stopping criterion the run(s) terminate with?

If you start with small a sigma from a found solution, do you observe a random walk on a plateau? This should be comparatively easy to diagnose from the standard plots. If you are in doubt don't hesitate to post a plot.

fergu commented 9 years ago

I only recently updated the library to the version which gave this information, but it's exiting with status 1. I have disabled exiting on EQUALFUNVALS due to the relatively low precision of the exterior piece of software we are using (only 2 digits after the decimal), however.

As far as question 2, I have not actually attempted this. I will give it a shot today or tomorrow and report back.

nikohansen commented 9 years ago

it's exiting with status 1

Sorry for my ignorance, but I don't know what that means ;-) Don't we get more specific termination reason from standard verbose output? (Sorry for my ignorance, I haven't tried this recently myself). They are often quite useful for tweaking or "digging deeper", at least I myself regularly look at them.

fergu commented 9 years ago

Oh! I'm sorry, I must have misunderstood your question.

I didn't have the library set up to output verbose logging, but the exit status is 1, which according to cmastopcriteria.h, is

 TOLHISTFUN = 1, // convergence

Which translates to (Also in cmastopcriteria.h)

{TOLHISTFUN,"[Success] The optimization has converged"},

Or, at least.... that's how I've been translating it :)

I just started a run which should output the verbose exit status when it's done - hopefully within a few hours. I can put that output here when I have it.

nikohansen commented 9 years ago

No problem TOLHISTFUN helps already: the likely (only?) reason is that the optimizer has ended up on a plateau. If it would have seen two f-values with a difference of at least 1e-3, this termination criterion would not apply.

nikohansen commented 9 years ago

Here is an "intrusive" suggestion for this case: if in the current sample, say, the median fitness equals to the best fitness, increase sigma, say, by a factor of 1.5. This could be done at the end of each iteration (I believe/hope it should be relatively simple to implement, I would appreciate to see a code snippet for my own education ;-) and should prevent early termination from being on a plateau. This does not necessarily "improve performance", because one might just search longer on the same plateau without finding better solutions.

fergu commented 9 years ago

I would believe that the optimizer has ended up on a plateau - or rather that the surface looks like a number of plateaus. We use an external program to obtain the value that is returned to the optimizer through the fitness function, and that external program only has a precision of 2 digits after the decimal (More is not really needed in the context of what it's meant for).

Thus, there is a range of polynomials that will not alter the output of the external program. Fortunately, the optimizer seems to search over a large enough region that it continually finds better answers. The actual value of the external program converges to the same value (within our tolerances) every time. The big difference is just that sigma remains relatively large even at the end of the simulation - though it seems I have been misinterpreting step size and error.

I am a little swamped today, but I am going to go back and calculate the standard deviation of the fitness function at the end of the simulation just based on the fval history that was output and see what I can say about that. Do you think this is a valid way to determine the degree of accuracy of fval? (I didn't see a function to pull this from the optimizer itself as was available for the x values)

nikohansen commented 9 years ago

I understand, my suggestion could reveal how large a single plateau (or the set of close by plateaus) is (are), as one could observe the parameter changes from the point in time where the plateau is reached, without that the step-size becomes too small to prevent any more relevant changes.

calculate the standard deviation of the fitness function at the end of the simulation just based on the fval history that was output and see what I can say about that. Do you think this is a valid way to determine the degree of accuracy of fval?

Not really (EDIT: unless it is zero). Did you look at the plot(s)? You should immediately see the best f-values and (I believe) also median and worst f-value per iteration. Again, don't hesitate to post it (a picture is worth a thousand words).

beniz commented 9 years ago

Don't we get more specific termination reason from standard verbose output?

Just FTR, yes we do, the status message is given by CMASolutions::status_msg() at any time.

beniz commented 9 years ago

FYI, https://github.com/beniz/libcmaes/commit/0532e00c983765645bf5d1eef25a7805a0609b66 fixes the custom plot function example.

beniz commented 9 years ago

if in the current sample, say, the median fitness equals to the best fitness, increase sigma, say, by a factor of 1.5. This could be done at the end of each iteration (I believe/hope it should be relatively simple to implement, I would appreciate to see a code snippet for my own education ;-)

A full example how to do this lies below based on the ask/tell framework. Commit https://github.com/beniz/libcmaes/commit/74143cea30bef8640260ff8cc1288ef907398697 adds the ability to force-set sigma, which until now was read-only from a user point of view. The example below can thus only work after you are using the last update to the lib:

#include "cmaes.h"
#include <iostream>

using namespace libcmaes;

FitFunc fsphere = [](const double *x, const int N)
{
  double val = 0.0;
  for (int i=0;i<N;i++)
    val += x[i]*x[i];
  return val;
};

class customCMAStrategy : public CMAStrategy<CovarianceUpdate>
{
public:
  customCMAStrategy(FitFunc &func,
                    CMAParameters<> &parameters)
    :CMAStrategy<CovarianceUpdate>(func,parameters)
  {
  }

  ~customCMAStrategy() {}

 dMat ask()
  {
    return CMAStrategy<CovarianceUpdate>::ask();
  }

void eval(const dMat &candidates,
            const dMat &phenocandidates=dMat(0,0))
  {
    // custom eval.                                                                                                                                                                                                               
    for (int r=0;r<candidates.cols();r++)
      {
        _solutions.get_candidate(r).set_x(candidates.col(r));
        if (phenocandidates.size()) // if candidates in phenotype space are given                                                                                                                                                 
          _solutions.get_candidate(r).set_fvalue(_func(phenocandidates.col(r).data(),candidates.rows()));
        else _solutions.get_candidate(r).set_fvalue(_func(candidates.col(r).data(),candidates.rows()));                                                                                                                  
      }
    update_fevals(candidates.cols());
  }

  void tell()
  {
    CMAStrategy<CovarianceUpdate>::tell();

    // compute median f-value and update sigma if median == best
    double median = 0.0;
    size_t csize = _solutions.candidates().size();
    if (csize % 2 == 0)
      median = (_solutions.get_candidate(csize/2.0-1).get_fvalue() + _solutions.get_candidate(csize/2.0).get_fvalue())/2.0;
    else median = _solutions.get_candidate(csize/2.0).get_fvalue();
    if (median == _solutions.best_candidate().get_fvalue())
      _solutions.set_sigma(_solutions.sigma()*1.5);
  }

bool stop()
  {
    return CMAStrategy<CovarianceUpdate>::stop();
  }
};

int main(int argc, char *argv[])
{
  int dim = 10; // problem dimensions.                                                                                                                                                                                            
  std::vector<double> x0(dim,10.0);
  double sigma = 0.1;

  CMAParameters<> cmaparams(x0,sigma);                                                                                                                                     
  ESOptimizer<customCMAStrategy,CMAParameters<>> optim(fsphere,cmaparams);

  while(!optim.stop())
    {
      dMat candidates = optim.ask();
      optim.eval(candidates);
      optim.tell();
      optim.inc_iter(); // important step: signals next iteration.                                                                                                                                                                
    }
  std::cout << optim.get_solutions() << std::endl;
}

The relevant code here lies in the tell() function. Maybe the median to best fitness equality test could be losen a bit ?

nikohansen commented 9 years ago

Thanks! What is the reason that we have to re-implement ask eval and stop? In the ideal world, I would think I could derive a class and needed only to add one additional method, say, manage_plateaus(sigma_factor=1.5, sample_fraction=0.5), which I then can call after tell, if desired.

nikohansen commented 9 years ago

A detail on sigma and the distribution mean: It should be save to manipulate/set these from the outside after tell has been called (or before ask has been called, which should be the same), but not between ask and tell, because tell "makes assumptions" on the used sample distribution. (This is not true for the covariance matrix C, because pc and C are interdependent, so C should not be changed "arbitrarily".)

beniz commented 9 years ago

What is the reason that we have to re-implement ask eval and stop?

None other than educational purposes in fact. There's not even need to derive a class and the following works fine as well:

#include "cmaes.h"
#include <iostream>

using namespace libcmaes;

FitFunc fsphere = [](const double *x, const int N)
{
  double val = 0.0;
  for (int i=0;i<N;i++)
    val += x[i]*x[i];
  return val;
};

void manage_plateau(CMASolutions &solutions,
                    const double &sigma_factor=1.5)
{
    double median = 0.0;
    size_t csize = solutions.candidates().size();
    if (csize % 2 == 0)
      median = (solutions.get_candidate(csize/2.0-1).get_fvalue() + solutions.get_candidate(csize/2.0).get_fvalue())/2.0;
    else median = solutions.get_candidate(csize/2.0).get_fvalue();
    if (median == solutions.best_candidate().get_fvalue())
      solutions.set_sigma(solutions.sigma()*sigma_factor);
}

int main(int argc, char *argv[])
{
  int dim = 10; // problem dimensions.                                                                                                                         
  std::vector<double> x0(dim,10.0);
  double sigma = 0.1;

  CMAParameters<> cmaparams(x0,sigma);
  //cmaparams.set_quiet(false);                                                                                                                                
  CMAStrategy<CovarianceUpdate> optim(fsphere,cmaparams);

  while(!optim.stop())
    {
      dMat candidates = optim.ask();
      optim.eval(candidates);
      optim.tell();
      manage_plateau(optim.get_solutions());
      optim.inc_iter(); // important step: signals next iteration.                                                                                             
    }
  std::cout << optim.get_solutions() << std::endl;
}
beniz commented 9 years ago

It should be save to manipulate/set these from the outside

Yes, I've thought of the issue before. Typically, for the case right above, the median values are in fact in memory within the CMASolutions object already (for the purpose of stopping criteria). However, they are updated by the default tell(). Since the value changes after tell(), I am not (yet) providing accessors to avoid having potential users misuse the value at the wrong time step. There are two solutions I see, locking the calls, but this seems awkward, and returning the time step along the value(s).

As for sigma and xmean, maybe a kind of locking mechanism...

Just since we're at it, another possibly related potential point of failure is #94.

nikohansen commented 9 years ago

What is puzzling me is that

solutions = optim.get_solutions() solutions.set_sigma(value)

apparently changes the value of sigma in optim. I don't get the logics behind this. EDIT: is it that we do in effect

optim.solutions.sigma = value

?

beniz commented 9 years ago

get_solutions yields a reference to the optimizer solution object. The object is unique to the optimizer and holds the current value of sigma. Does this help ? Maybe I don't see the problem yet.

beniz commented 9 years ago

optim.solutions.sigma = value

yes.

nikohansen commented 9 years ago

Yes, #94 is indeed related. Locking the values between ask and tell seems a good approach to me.

fergu commented 9 years ago

Hello,

I tried to look into seeing how the function value behaved around the search point. Unfortunately, as far as I can tell the default plotting function only returns information on the best candidate for a given generation. I tried adding this information through the use of a custom plotting function and

cmasols.candidates()

but the compiler doesn't seem very happy about me trying to access that variable, no matter how I try to cast it. Is there a certain way it should be accessed?

beniz commented 9 years ago

CMASolutions::candidates() returns a reference to the vector of candidates, as opposed to a copy. Typically the compiler will complain if you are trying this on a const CMASolutions object. You can use the CMASolutions::get_candidates(const int) const in order to access every candidate through a copy instead. If this is not the issue, I guess that posting the code of your custom plotting function would be the best.

fergu commented 9 years ago

Hi,

I apologize for my delayed response. It's been a busy few weeks! Hopefully things will slow down as we go into May, though.

get_candidates(const int) is not a function according to my compiler - did you mean get_candidate(const int)?

In that case I can just loop through the number of candidates and extract the information I need and print it, which should work just fine. I'll give that a test and report back.

beniz commented 9 years ago

@kjfergu yes, get_candidate(const int) const is the right one. And yes you can loop through the candidates safely from within the custom plotting function.

fergu commented 7 years ago

Just realized I never closed this. The above conversation successfully came to the solution I was looking for.