CMA-ES / libcmaes

libcmaes is a multithreaded C++11 library with Python bindings for high performance blackbox stochastic optimization using the CMA-ES algorithm for Covariance Matrix Adaptation Evolution Strategy
Other
321 stars 78 forks source link

CMASolutions reports best candidate's x value incorrect #231

Open godrays opened 2 years ago

godrays commented 2 years ago

Hello,

I found that the CMASolutions does not report best candidate's x value properly.

Here is an example output from 20 iteration:

Best solution: f-value: -6, x: 8.94434
Best solution: f-value: -6, x: 8.41353
Best solution: f-value: -6, x: 8.41353
Best solution: f-value: -6, x: 8.12711
Best solution: f-value: -6, x: 8.12711
Best solution: f-value: -6, x: 8.94491
Best solution: f-value: -6, x: 8.94491
Best solution: f-value: -6, x: 8.56085
Best solution: f-value: -6, x: 8.56085
Best solution: f-value: -6, x: 8.01153
Best solution: f-value: -6, x: 8.21036
Best solution: f-value: -6, x: 8.21036
Best solution: f-value: -6, x: 37.9178
Best solution: f-value: -6, x: 37.9178
Best solution: f-value: -6, x: 37.1124
Best solution: f-value: -6, x: 8.26821
Best solution: f-value: -6, x: 8.26821
Best solution: f-value: -6, x: 8.59635
Best solution: f-value: -6, x: 8.59635
Best solution: f-value: -6, x: 8.84206

I validated that x values in fitness function never gets higher than 21 during simulations, which is expected. You can see some of the x values are over 21. Sometimes I see negative x values too, which is also NOT expected.

Here is the test code I used:

#include <iostream>
#include <limits>
#include <libcmaes/cmaes.h>

using namespace libcmaes;

int main()
{
                      // x: 0  1  2  3   4   5   6   7   8   9  10 11 12 13 14 15 16 17  18  19  20, 21
    std::vector<double>  y{ 3, 2, 1, 0, -1, -2, -3, -4, -6, -3, -1, 2, 5, 7, 8, 9, 8, 3, -2, -4, -5, -1 };

    int minX = std::numeric_limits<int>::max();
    int maxX = std::numeric_limits<int>::min();

    FitFunc fsphere = [&](const double *x, const int N)
    {
        int i = static_cast<int>(x[0]);

        minX = std::min<int>(i, minX);
        maxX = std::max<int>(i, maxX);

        return y[i];
    };

    const int dim = 1;
    const double sigma = 1;
    const int numOfRun = 20;    // run the same optimization n times

    double lbounds[dim];        // lower parameter bounds
    double ubounds[dim];        // upper parameter bounds

    for (int n=0; n<numOfRun; ++n)
    {
        std::vector<double>  x0(dim, 15.0);

        for (int i=0; i<dim; ++i)
        {
            lbounds[i] = 0.0;
            ubounds[i] = static_cast<double>(y.size());
        }

        GenoPheno<pwqBoundStrategy> gp(lbounds, ubounds, dim);
        CMAParameters<GenoPheno<pwqBoundStrategy>> cmaparams(x0, sigma, 100, 0, gp);
        cmaparams.set_algo(aCMAES);
        CMASolutions cmasols = cmaes<GenoPheno<pwqBoundStrategy>>(fsphere, cmaparams);

        std::cout << "Best solution: f-value: " << cmasols.best_candidate().get_fvalue()
                  << ", x: " << cmasols.best_candidate().get_x().front() << std::endl;
    }

    std::cout << "min X: " << minX << ", max X: " << maxX << std::endl;

    return 0;
}

Is this expected behavior? If yes then why?

I compiled the lib with OpenMP=OFF I use v0.10 and built it from the source code.

Thanks

godrays commented 2 years ago

I found a suggestion in other issues saying we needed to use cmasols.get_best_seen_candidate() instead of cmasols.best_candidate()

    auto best_candidate = cmasols.get_best_seen_candidate();

    std::cout << "Best solution: f-value: " << best_candidate.get_fvalue()
              << ", x: " << best_candidate.get_x().front() << std::endl;

Unfortunately, I still see incorrect answers in outputs.. Just see below that x should never be lower than 0.

...
Best solution: f-value: -6, x: 8.78817
Best solution: f-value: -6, x: -8.36663

Here is an other incorrect output.. x should never be higher than 21.

...
Best solution: f-value: -6, x: 8.6857
Best solution: f-value: -6, x: 8.69536
Best solution: f-value: -6, x: 37.5472
Best solution: f-value: -6, x: 37.5472
Best solution: f-value: -6, x: 8.07772
...

Please note that the issue happened less frequently than the first test, but still the same issue.

beniz commented 2 years ago

Hi, see https://github.com/CMA-ES/libcmaes/wiki/Defining-and-using-bounds-on-parameters You are using a genopheno for bounded parameters, thus need to print the solution in pheno space: cmasols.print(std::cout, 0, gp) ? Also, for you own better understanding, the implementation is here: https://github.com/CMA-ES/libcmaes/blob/master/src/cmasolutions.cc#L257 (see the gp.get_pheno(best_candidate()... call.

godrays commented 2 years ago

Thank you for the quick clarification.

Transformation solved my issue. However, cmasols.print(...) implementation you mentioned uses the best_candidate() for report

<< " / x=" << gp.pheno(best_candidate().get_x_dvec()).transpose();

The documentation suggests using best_seen_candidate(), which is confusing.

Eigen::VectorXd bestparameters = gp.pheno(cmasols.get_best_seen_candidate().get_x_dvec());

In this case, anyone who uses cmasols.print(...) might not see the best parameters. What is the exact difference between best_candidate() and best_seen_candidate()? Can we document the diff?

beniz commented 2 years ago

best_seen_candidate() returns the best over the whole run, while best_candidate() returns the current solution.

You should be able to apply gp.pheno to best_seen_candidate(), this is th reason why I pointed to the implementation.

Maybe we should provide a print_best_seen function as well...