CMA-ES inconsistencies - Githubissues

mlpack / ensmallen

A header-only C++ library for numerical optimization --

http://ensmallen.org

Other

739 stars 119 forks source link

CMA-ES inconsistencies #70

Open rcurtin opened 5 years ago

rcurtin commented 5 years ago

This is a continued version of mlpack/mlpack#1475, ported to the ensmallen repository because this is a more appropriate place. This was originally opened by @FloopCZ with the following text:

Hi everyone, first of all, thank you for the huge amount of work you have done on this project. After a brief pass through the CMA-ES implementation, I found a few inconsistencies with the documentation.

The documentation of the Optimize function states that the last parameter (iterate) is also used as the starting point of the algorithm. However, this parameter seems to be basically ignored and the starting point is generated randomly here.

When the algorithm does not converge, it suggests to use a smaller step size here, yet there is no way to change the step size.

The lowerBound and upperBound variables don't seem to be true bounds for the variables. As far as I understand the code, those are only used to deduce the initial sigma value and the random starting point.

Have a nice day!

There are some responses in the original issue about how we should handle each point, but the issue is not yet solved.

zoq commented 5 years ago

CMA-ES isn't related to NEAT, but you get to know the optimizer infrastructure.

favre49 commented 5 years ago

CNE also does the same as CMAES for point 1. Are there any drawbacks to initializing the population by adding matrices generated by arma::randn() to the starting point?

zoq commented 5 years ago

Right, no need to restrict the starting point.

favre49 commented 5 years ago

@zoq For point 2, besides your suggestions, could we also include the initial step size as a parameter? Currently, we experience a problem when we use larger bounds, since the initial step size becomes too large. This causes the CMAES implementation to fail tests when the parameters are changed to include larger bounds (For e.g., [-25,25] for the LogisticRegressionTest). This problem is fixed if we change the step size.

Moreover, I think instead of using one set of bounds for all parameters of a candidate, the user should be able to apply constraints to each parameter separately, since that is more practically useful. Please let me know your thoughts on this

zoq commented 5 years ago

Sounds reasonable to me, I think this is already included in #88?

favre49 commented 5 years ago

No, I wanted to make sure it was alright before including it. I'll make the required changes to #88 soon

zoq commented 5 years ago

Sounds good, thanks!

rcurtin commented 5 years ago

Do you want me to wait to release the next version of ensmallen until #88 is merged?

favre49 commented 5 years ago

I'm not sure whether that question was meant for me, but I should let you know that i might not be able to work on #88 much for the next couple weeks, since I'm going to be busy with exams and stuff.

rcurtin commented 5 years ago

No worries, we can just get #90 merged and then release, assuming that that one is basically ready (I think it is, I'll try and review it tomorrow).

favre49 commented 5 years ago

Sure :) Also, about applying box constraints for each parameter, i don't think we currently have any implementation that does that yet. Would it be fine to add a parameter in the constructor for an arma::cube where each slice has the constraints? Or is there some other way you would like to do this?

rcurtin commented 5 years ago

Personally I don't see any issue with an arma::cube for constraints. It might be worth having that as a separate constructor, so that the user can choose to provide two double parameters for the minimum and maximum in each dimension (which is easy to make a hypercube with), or a more complex arma::cube.